Feature selection is a vital preprocessing step before applying any machine learning algorithm. It aims to reduce the number of features in a dataset by removing irrelevant, noisy, and redundant ones. Feature selection can be viewed as an optimization problem whose goal is to maximize or minimize an evaluation measure for the machine learning task, mainly classification. Metaheuristic algorithms are optimization algorithms that can be applied to feature selection. In this research, a wrapper feature selection model based on Differential Evolution (DE) is compared with filter methods such as Chi2 and ReliefF. Three classification algorithms, k-Nearest Neighbors (KNN), Support Vector Machines (SVM), and Decision Trees (DT), are used to evaluate the feature selection algorithms. The proposed model is tested on a recent malware dataset obtained from the UCI repository. The results show that DT achieves the highest accuracy and performs consistently well under both wrapper and filter feature selection, so it can be considered the most effective algorithm for this dataset, although SVM and KNN remain viable alternatives depending on specific requirements or preferences.
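The two approaches compared in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual setup: it uses scikit-learn's breast-cancer data as a stand-in for the malware dataset, a Chi2 filter (`SelectKBest`), and a toy DE wrapper that searches over real-valued vectors thresholded into binary feature masks, with cross-validated Decision Tree accuracy as the fitness. Population size, generation count, and the DE parameters F and CR are illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)   # stand-in for the malware dataset
n_features = X.shape[1]
clf = DecisionTreeClassifier(random_state=0)

# --- Filter approach: rank features by the chi2 statistic, keep the top k ---
k = 10
X_chi2 = SelectKBest(chi2, k=k).fit_transform(X, y)
acc_chi2 = cross_val_score(clf, X_chi2, y, cv=5).mean()

# --- Wrapper approach: a toy DE search over binary feature masks ---
rng = np.random.default_rng(0)

def fitness(vec):
    mask = vec > 0.5                          # threshold real vector -> feature mask
    if not mask.any():
        return 0.0                            # penalize empty feature subsets
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

pop = rng.random((8, n_features))             # small population of candidate vectors
scores = np.array([fitness(p) for p in pop])
F, CR = 0.8, 0.9                              # DE mutation factor and crossover rate
for _ in range(5):                            # a few DE generations
    for i in range(len(pop)):
        a, b, c = pop[rng.choice(len(pop), 3, replace=False)]
        mutant = np.clip(a + F * (b - c), 0, 1)     # DE/rand/1 mutation
        cross = rng.random(n_features) < CR
        trial = np.where(cross, mutant, pop[i])     # binomial crossover
        s = fitness(trial)
        if s >= scores[i]:                          # greedy selection
            pop[i], scores[i] = trial, s

best_mask = pop[scores.argmax()] > 0.5
print(f"chi2 filter (top {k} features): {acc_chi2:.3f}")
print(f"DE wrapper ({best_mask.sum()} features): {scores.max():.3f}")
```

The key contrast is that the filter step scores features once, independently of any classifier, while the wrapper re-trains the classifier for every candidate subset, which is more expensive but tailored to the model being evaluated.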
Maied Ayash Alanazi, Maheyzah Md Siraj, Fuad A. Ghaleb