Spam email causes lost productivity and increased consumption of network resources. Spam messages sometimes carry malware as attachments or include links to phishing websites, leading to theft and loss of data. Many email servers filter spam, but filtering becomes increasingly difficult as spammers craft messages that resemble legitimate email. In this paper we implemented five machine learning algorithms in Python using the scikit-learn library and compared their performance on two publicly available spam email corpora. The algorithms are: Support Vector Machine, Random Forest, Logistic Regression, Multinomial Naive Bayes, and Gaussian Naive Bayes.
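As an illustration of the kind of comparison described above, the following is a minimal sketch and not the implementation used in the paper. It assumes a bag-of-words representation built with `CountVectorizer`, a handful of made-up messages in place of the real corpora, and default or near-default hyperparameters; the helper name `evaluate` and all data are hypothetical.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB, MultinomialNB
from sklearn.svm import SVC

# Toy messages invented for illustration; they are NOT from the paper's corpora.
train_texts = [
    "win a free prize now",
    "cheap loans click here",
    "claim your free money today",
    "meeting moved to monday morning",
    "please review the attached report",
    "lunch at noon with the team",
]
train_labels = [1, 1, 1, 0, 0, 0]  # 1 = spam, 0 = legitimate

test_texts = ["free prize click now", "report for the monday meeting"]
test_labels = [1, 0]

# Assumed feature extraction: raw word counts (bag of words).
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# The five algorithms named in the abstract; hyperparameters are assumptions.
classifiers = {
    "Support Vector Machine": SVC(kernel="linear"),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Multinomial Naive Bayes": MultinomialNB(),
    "Gaussian Naive Bayes": GaussianNB(),
}


def evaluate(classifiers, X_train, y_train, X_test, y_test):
    """Fit each classifier and return a dict of test-set accuracies."""
    scores = {}
    for name, clf in classifiers.items():
        # GaussianNB does not accept sparse matrices, so densify for it.
        needs_dense = isinstance(clf, GaussianNB)
        X_tr = X_train.toarray() if needs_dense else X_train
        X_te = X_test.toarray() if needs_dense else X_test
        clf.fit(X_tr, y_train)
        scores[name] = clf.score(X_te, y_test)
    return scores


scores = evaluate(classifiers, X_train, train_labels, X_test, test_labels)
for name, accuracy in scores.items():
    print(f"{name}: accuracy = {accuracy:.2f}")
```

With a real corpus, the toy lists would be replaced by the loaded messages and labels, and accuracy alone would likely be supplemented with precision and recall, since misclassifying legitimate email as spam is usually the costlier error.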
Nikhil Kumar, Sanket Sonowal, NISHANT NISHANT
Gayatri Gattani, Shamla Mantri, Seema Nayak