Due to the rise in cyber-attacks and the rapidly evolving nature of technology and malware, there is a need for a working model capable of detecting malicious files from their features. This project uses the drebin-215-dataset-5560malware-9476-benign.csv dataset, a diverse collection of malware and benign samples that covers different types of malware. Feature extraction techniques capture relevant attributes of the samples, including file system activities, network traffic, and more. Several machine learning algorithms, namely Decision Tree, Random Forest, Support Vector Machine (SVM), K-Nearest Neighbour (KNN), Logistic Regression, and Convolutional Neural Networks, are then trained and evaluated on the extracted features to classify each sample as malicious or benign. Each algorithm is assessed in terms of accuracy, precision, recall, and F1 score. In addition, the models are tested for their ability to generalize to unseen data and resist overfitting. A comparative analysis identifies the most effective malware detection algorithm for the characteristics of the dataset. The results provide insight into the effectiveness of various machine learning techniques for malware detection and contribute to the development of more robust and proactive cyber security solutions. By leveraging machine learning, organizations can improve their ability to detect and mitigate malware threats in real time, thereby strengthening the overall security posture of their systems and networks.
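The four evaluation metrics named above can be made concrete with a short sketch. The code below is illustrative only and is not the authors' evaluation code: the function name `binary_metrics`, the `BinaryMetrics` container, and the label convention (1 = malware, 0 = benign) are assumptions introduced here, and the toy labels are invented purely to exercise the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    """Container for the four scores (hypothetical helper type)."""
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class.

    Assumption: labels are encoded as 1 = malware (positive), 0 = benign.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Toy labels, invented for illustration; not taken from the dataset.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

Reporting precision and recall alongside accuracy matters for a dataset like this one, where the two classes are not equally represented: a classifier can score well on accuracy while still missing a meaningful share of malicious samples.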
Anne Yeswanth Sai, B. Neela Konda Reddy, K Amarendra, N. Venkata Ramana Gupta
Sakshi Joshi, Santosh Mahagaonkar
P.S. Raghuvanshi, Jyoti Prakash Singh