Mohammed Chemmakh, Omar Habibi, Mohamed Lazaar
Machine learning performance always relies on a relevant pre-processing phase, which includes dataset cleaning, cleansing and extraction. Feature selection (FS) is a crucial phase too, because it is intended to increase the efficiency of Machine Learning (ML) models in terms of predictive performance, by assigning a representative value to the most important features in a malware dataset. In this study, we focus on feature selection using embedded methods in order to minimize the computational time and complexity of ML models. Embedded methods combine the advantages of filter-based and wrapper-based methods: they assess the importance of features while the model is being trained, and they keep execution time low. Applying ML models shows that the models remain highly stable when the 10 most relevant features are selected from the dataset, with accuracies reaching 99.47% and 99.02% for Random Forest (RF) and XGBoost (XGB), respectively.
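The embedded selection described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn, replaces the malware dataset with synthetic data from `make_classification`, and the sample count, feature count and hyperparameters are placeholders chosen only for illustration. Only Random Forest is shown; a gradient-boosted model exposing feature importances could presumably be substituted in the same way.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a malware dataset (assumption: 50 features, binary label).
X, y = make_classification(
    n_samples=1000, n_features=50, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Embedded selection: feature importances are learned while the forest is trained.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0),
    max_features=10,
    threshold=-np.inf,  # disable the threshold so exactly the top 10 are kept
)
selector.fit(X_train, y_train)
selected = np.flatnonzero(selector.get_support())  # indices of the kept features

# Retrain on the reduced feature set and evaluate on held-out data.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(selector.transform(X_train), y_train)
accuracy = clf.score(selector.transform(X_test), y_test)

print("selected feature indices:", selected)
print("test accuracy on 10 features:", round(accuracy, 4))
```

The accuracy printed here reflects the synthetic data only and is not comparable to the figures reported in the abstract.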
Inam Ullah Khan, Fida Muhammad Khan, Zeeshan Ali Haider, Saba Khattak, Gulshan Naheed, Sana Shaoor Kiani
Anisha Mahato, Rana Majumdar, Swarup Kr Ghosh
Rakibul Hasan, Barna Biswas, Md Samiun, Md. Abu Saleh, Mani Prabha, Jahanara Akter, Fatema Haque Joya, Masuk Abdullah
Namita Dabas, Prachi Ahlawat, Prabha Sharma