JOURNAL ARTICLE

Static Analysis for Malware Classification Using Machine and Deep Learning

Abstract

Malware, or malicious software, is a general term for any program or code that can harm systems. This hostile, intrusive, and intentionally harmful code uses a variety of techniques to protect itself and to evade detection and removal, including code obfuscation, polymorphism, metamorphism, encryption, and encrypted communication. Current state-of-the-art research focuses on applying artificial intelligence techniques to malware detection and classification. In this context, this paper proposes a new malware classification approach based on static analysis, using seven machine learning algorithms (LightGBM, XGBoost, Logistic Regression, KNN, SVM, Naive Bayes, and Random Forest) and deep learning fine-tuning. These models use the SelectKBest technique during data engineering to select the 893 most relevant features for classifying 10,868 malware samples into 9 families, which reduces overfitting and training time. The results show that Gradient Boosting algorithms such as LightGBM, with hyperparameter optimization, exceed the reference results in competitions such as Kaggle, with a logarithmic loss of 0.00118, an accuracy close to 100%, and prediction times under 2.3 ms. This is fast enough to classify malware in real-time systems.
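
The feature-selection and evaluation steps named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption and not the authors' setup: the data is synthetic (generated with scikit-learn's make_classification), the scoring function is f_classif (the abstract does not name one), K is set to 50 where the paper keeps 893 features, and Random Forest (one of the seven listed algorithms) stands in for LightGBM to keep dependencies small.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score, log_loss
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for static-analysis features (NOT the paper's dataset):
# 900 samples, 200 candidate features, 9 classes to mirror the 9 families.
X, y = make_classification(
    n_samples=900,
    n_features=200,
    n_informative=20,
    n_redundant=10,
    n_classes=9,
    n_clusters_per_class=1,
    random_state=0,
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical K for this toy data; the paper reports keeping 893 features.
K = 50

# SelectKBest scores each feature against the labels and keeps the top K.
# Placing it inside the pipeline means it is fitted on training data only.
model = make_pipeline(
    SelectKBest(score_func=f_classif, k=K),
    RandomForestClassifier(n_estimators=200, random_state=0),
)
model.fit(X_train, y_train)

# Multiclass logarithmic loss is computed from predicted class probabilities.
proba = model.predict_proba(X_test)
test_log_loss = log_loss(y_test, proba, labels=model.classes_)
test_accuracy = accuracy_score(y_test, model.predict(X_test))
n_selected = int(model.named_steps["selectkbest"].get_support().sum())

print(f"features kept: {n_selected}")
print(f"log loss: {test_log_loss:.4f}  accuracy: {test_accuracy:.3f}")
```

Fitting the selector inside the pipeline, instead of on the full dataset beforehand, keeps test samples from influencing which features are chosen. The numbers printed by this sketch describe the synthetic data only and are not comparable to the results reported in the abstract.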

Keywords:
Computer science, Malware, Machine learning, Artificial intelligence, Random forest, Overfitting, Naive Bayes classifier, Support vector machine, Hyperparameter, Feature selection, Context (archaeology), Deep learning, Feature engineering, Statistical classification, Data mining, Artificial neural network, Computer security

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.54
Refs: 20
Citation Normalized Percentile: 0.61

Topics

Advanced Malware Detection Techniques
Physical Sciences → Computer Science → Signal Processing
Network Security and Intrusion Detection
Physical Sciences → Computer Science → Computer Networks and Communications
Digital and Cyber Forensics
Physical Sciences → Computer Science → Information Systems