JOURNAL ARTICLE

Feature Selection for High-Dimensional Imbalanced Malware Data Using Filter and Wrapper Selection Methods

Abstract

Feature selection is a vital preprocessing step before applying any machine learning algorithm. It reduces the number of features in a dataset by removing irrelevant, noisy, and redundant ones. Feature selection can be viewed as an optimization problem in which the goal is to maximize or minimize an evaluation measure for a machine learning task, most commonly classification. Metaheuristic algorithms are optimization algorithms that can be applied to this problem. This research compares a wrapper feature selection model based on Differential Evolution (DE) with filter methods such as Chi2 and ReliefF. Three classification algorithms, k-Nearest Neighbors (KNN), Support Vector Machines (SVM), and Decision Trees (DT), are used to evaluate the feature selection algorithms. The proposed model is tested on a recent malware dataset obtained from the UCI repository. The results show that DT achieves the highest accuracy and performs consistently well with both wrapper and filter feature selection techniques, so DT can be considered the most effective of the three classifiers for this dataset. However, SVM and KNN remain viable alternatives, depending on specific requirements or preferences.
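
As a rough illustration of the comparison described above, the following Python sketch contrasts a Chi2 filter with a DE-based wrapper, using a decision tree as the evaluating classifier. It is not the authors' implementation: the synthetic data from `make_classification` stands in for the UCI malware dataset, and the helper names (`cv_accuracy`, `chi2_filter`, `de_wrapper`), the 0.5 threshold for turning DE vectors into feature masks, and the DE settings (`F`, `CR`, population size, generations) are all assumptions chosen for brevity.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the malware data (NOT the UCI dataset).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_redundant=2, weights=[0.85, 0.15], random_state=0)
X = MinMaxScaler().fit_transform(X)  # chi2 requires non-negative features


def cv_accuracy(mask):
    """Mean 3-fold CV accuracy of a decision tree on the selected columns."""
    if not mask.any():  # an empty feature subset cannot be evaluated
        return 0.0
    clf = DecisionTreeClassifier(random_state=0)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()


def chi2_filter(k):
    """Filter approach: keep the k features with the highest chi2 score."""
    selector = SelectKBest(chi2, k=k).fit(X, y)
    return selector.get_support()


def de_wrapper(pop_size=10, generations=10, F=0.5, CR=0.7, seed=0):
    """Wrapper approach: DE/rand/1/bin over [0, 1]^d, thresholded at 0.5."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pop = rng.random((pop_size, d))
    fitness = np.array([cv_accuracy(ind > 0.5) for ind in pop])
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = np.clip(a + F * (b - c), 0.0, 1.0)  # differential mutation
            cross = rng.random(d) < CR                    # binomial crossover
            cross[rng.integers(d)] = True                 # keep at least one mutant gene
            trial = np.where(cross, mutant, pop[i])
            trial_fit = cv_accuracy(trial > 0.5)
            if trial_fit >= fitness[i]:                   # greedy one-to-one replacement
                pop[i], fitness[i] = trial, trial_fit
    best = int(np.argmax(fitness))
    return pop[best] > 0.5, float(fitness[best])


filter_mask = chi2_filter(k=5)
filter_acc = cv_accuracy(filter_mask)
wrapper_mask, wrapper_acc = de_wrapper()
print("chi2 filter:", int(filter_mask.sum()), "features, accuracy", round(filter_acc, 3))
print("DE wrapper :", int(wrapper_mask.sum()), "features, accuracy", round(wrapper_acc, 3))
```

For simplicity, both selectors see the full dataset before cross-validation, which gives optimistic scores; a rigorous comparison would nest the selection inside each fold. Swapping `DecisionTreeClassifier` for `KNeighborsClassifier` or `SVC` would extend the sketch to the other two classifiers named in the abstract.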

Keywords:
Feature selection; Computer science; Support vector machine; Artificial intelligence; Machine learning; Filter (signal processing); Preprocessor; Feature (linguistics); Data mining; Data pre-processing; Selection (genetic algorithm); Malware; Pattern recognition (psychology); k-nearest neighbors algorithm; Feature extraction

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 1.34
Refs: 24
Citation Normalized Percentile: 0.77

Topics

Advanced Malware Detection Techniques
Physical Sciences → Computer Science → Signal Processing
Network Security and Intrusion Detection
Physical Sciences → Computer Science → Computer Networks and Communications
Artificial Immune Systems Applications
Physical Sciences → Engineering → Biomedical Engineering