Text and image based spam email classification using KNN, Naïve Bayes and Reverse DBSCAN algorithm

Anirudh Harisinghaney; Aman Dixit; Saurabh Gupta; Anuja Arora

doi:10.1109/icroit.2014.6798302

JOURNAL ARTICLE

Year: 2014

Abstract

The Internet has changed the way people communicate, and communication has become increasingly concentrated on email. Emails, text messages and online chat have become part and parcel of our lives. Of all these channels, emails are more prone to exploitation than the others. Email providers therefore employ algorithms to separate spam from ham. In this research paper, our primary aim is to detect text-based as well as image-based spam emails. To achieve this objective we applied three algorithms: KNN, Naïve Bayes and reverse DBSCAN. The email text is pre-processed before the algorithms are run so that they predict better. The paper uses the Enron corpus of spam and ham emails. We compare the performance of all three algorithms on four measures: precision, sensitivity, specificity and accuracy. All three algorithms attain good accuracy, and the results compare them on the same data set.
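The four measures named in the abstract can be computed from the counts of a binary confusion matrix. The sketch below is illustrative only and is not taken from the paper: it assumes spam is the positive class, and the names `ConfusionCounts` and `evaluate` as well as the example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical container for binary confusion-matrix counts (spam = positive)."""
    tp: int  # spam correctly classified as spam
    fp: int  # ham wrongly classified as spam
    tn: int  # ham correctly classified as ham
    fn: int  # spam wrongly classified as ham


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 instead of raising when a class never occurs.
    return numerator / denominator if denominator else 0.0


def evaluate(c: ConfusionCounts) -> dict:
    """Compute precision, sensitivity, specificity and accuracy from the counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "precision": _ratio(c.tp, c.tp + c.fp),    # share of flagged mail that is spam
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # share of spam that is caught (recall)
        "specificity": _ratio(c.tn, c.tn + c.fp),  # share of ham that is let through
        "accuracy": _ratio(c.tp + c.tn, total),    # share of all mail classified correctly
    }


if __name__ == "__main__":
    # Made-up counts for illustration; these are not results from the paper.
    example = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
    for name, value in evaluate(example).items():
        print(f"{name}: {value:.3f}")
```

Reporting specificity alongside precision matters for spam filtering because a ham message wrongly marked as spam is usually costlier to the user than a spam message that slips through.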

Keywords:

Computer science, Naive Bayes classifier, Algorithm, Statistical classification, The Internet, Machine learning, Filter (signal processing), DBSCAN, Artificial intelligence, Data mining, Information retrieval, Cluster analysis, Support vector machine, World Wide Web

Metrics

FWCI (Field Weighted Citation Impact): 8.88

Citation Normalized Percentile: 0.97

Topics

Spam and Phishing Detection

Physical Sciences → Computer Science → Information Systems

Text and Document Classification Technologies

Physical Sciences → Computer Science → Artificial Intelligence

Network Security and Intrusion Detection

Physical Sciences → Computer Science → Computer Networks and Communications

Related Documents

Email spam classification using neighbor probability based Naïve Bayes algorithm

Probability-based Naïve Bayes Algorithm for Email Spam Classification

Automatic Email Spam Classification Using Naïve Bayes

Email Spam Detection using Naïve Bayes Algorithm

Classification of Spam Email Using Intelligent Water Drops Algorithm with Naïve Bayes Classifier