Sergio Ramírez‐Gallego, Iago Lastra, David Martínez‐Rego, Verónica Bolón‐Canedo, José M. Benítez, Francisco Herrera, Amparo Alonso‐Betanzos
With the advent of large-scale problems, feature selection has become a fundamental preprocessing step for reducing input dimensionality. The minimum-redundancy-maximum-relevance (mRMR) selector is considered one of the most relevant methods for dimensionality reduction because of its high accuracy. However, it is computationally expensive, and its cost is sharply affected by the number of features. This paper presents fast-mRMR, an extension of mRMR that aims to overcome this computational burden. Together with fast-mRMR, we provide a package with three implementations of the algorithm for different platforms: CPU for sequential execution, GPU (graphics processing unit) for parallel computing, and Apache Spark for distributed computing using big data technologies.
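For readers unfamiliar with the criterion, the following is a minimal sketch of the standard greedy mRMR selection loop on discrete features, scoring each candidate by its mutual information with the class minus its mean mutual information with the features already selected. It is an illustration only and not the fast-mRMR implementation from the package: the function names `mutual_information` and `mrmr_select`, the column-list input layout, and the choice to accumulate redundancy incrementally are all assumptions made for this example.

```python
from collections import Counter
from math import log


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    return sum((c / n) * log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())


def mrmr_select(columns, target, k):
    """Greedily pick k feature indices with the mRMR difference criterion.

    `columns` is a list of discrete feature columns (hypothetical layout
    chosen for this sketch); `target` is the class column.
    """
    n_features = len(columns)
    # Relevance of each feature: mutual information with the class.
    relevance = [mutual_information(col, target) for col in columns]
    # Running sum of redundancy with the already selected features, so each
    # feature pair is evaluated at most once (an assumption of this sketch).
    redundancy_sum = [0.0] * n_features

    selected = [max(range(n_features), key=lambda j: relevance[j])]
    remaining = [j for j in range(n_features) if j != selected[0]]

    while len(selected) < k and remaining:
        last = selected[-1]
        for j in remaining:
            redundancy_sum[j] += mutual_information(columns[j], columns[last])
        # Score: relevance minus mean redundancy with the selected set.
        best = max(remaining,
                   key=lambda j: relevance[j] - redundancy_sum[j] / len(selected))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    a = [0, 0, 1, 1] * 2
    b = [0, 1, 0, 1] * 2
    target = [2 * ai + bi for ai, bi in zip(a, b)]
    # Feature 1 duplicates feature 0, so it is ranked after feature 2.
    print(mrmr_select([a, a, b], target, 3))
```

In this toy example the duplicated feature is penalized by its redundancy with the first pick, so the complementary feature is chosen before it. The number of pairwise mutual information evaluations grows with both the number of candidate features and the number of selected features, which is where the cost discussed above comes from.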