JOURNAL ARTICLE

Optimizing Email Spam Classification Using Naïve Bayes and Principal Component Analysis

Shinta Virgiana, Rudi Kurniawan, Tati Suprapti

Year: 2025 Journal: Journal of Artificial Intelligence and Engineering Applications (JAIEA) Vol: 4 (2) Pages: 1029-1036

Abstract

In the ever-evolving digital era, spam filtering remains an important challenge for keeping email services secure and convenient. The Naïve Bayes algorithm is widely used for spam classification because it scales well to large datasets, although it still has limitations in accuracy, precision, and recall. This research aims to improve spam classification performance by combining Naïve Bayes with Principal Component Analysis (PCA), and to identify the optimal parameters for dimensionality reduction. The methodology follows the Knowledge Discovery in Databases (KDD) stages: selection, preprocessing, transformation using PCA, development of a Naïve Bayes classification model, and evaluation of model performance. The dataset consists of emails labeled as spam or non-spam. The experimental results show that the combination of Naïve Bayes and PCA achieves its highest accuracy, 99.24%, with 7 principal components. Fixing the number of components performed better than selecting components by preserved variance, which underscores the importance of choosing appropriate PCA parameters for classification effectiveness. The results indicate that PCA not only reduces the complexity of the dataset but also increases the efficiency of the classification algorithm.
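The two PCA settings compared in the abstract (a fixed number of components versus a preserved-variance threshold) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the data is synthetic rather than the paper's email dataset, the classifier is a Gaussian Naïve Bayes variant (the abstract does not say which variant was used), the 0.95 variance threshold is arbitrary, and the helper names `fit_pca`, `project`, `fit_gaussian_nb`, `predict_gaussian_nb`, and `evaluate` are hypothetical. It is not the authors' implementation and does not reproduce their results.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA by SVD. An int keeps that many components; a float in (0, 1)
    keeps the fewest components whose cumulative explained variance reaches it."""
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    if isinstance(n_components, float):
        cumulative = np.cumsum(s**2) / np.sum(s**2)
        k = min(int(np.searchsorted(cumulative, n_components)) + 1, len(s))
    else:
        k = n_components
    return mean, vt[:k]

def project(X, mean, components):
    """Map data onto the retained principal components."""
    return (X - mean) @ components.T

def fit_gaussian_nb(Z, y):
    """Estimate per-class log priors, feature means, and feature variances."""
    classes = np.unique(y)
    log_priors = np.log([np.mean(y == c) for c in classes])
    means = np.array([Z[y == c].mean(axis=0) for c in classes])
    variances = np.array([Z[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, log_priors, means, variances

def predict_gaussian_nb(model, Z):
    """Pick the class maximizing log prior plus summed per-feature log likelihood."""
    classes, log_priors, means, variances = model
    diff = Z[:, None, :] - means[None, :, :]  # shape: (samples, classes, features)
    log_lik = -0.5 * (np.log(2 * np.pi * variances) + diff**2 / variances).sum(axis=2)
    return classes[np.argmax(log_priors + log_lik, axis=1)]

# Synthetic stand-in data (assumption): 20 numeric features, with the
# "spam" class (label 1) shifted in the first 5 features.
rng = np.random.default_rng(0)
n_samples, n_features = 400, 20
y = rng.integers(0, 2, size=n_samples)
X = rng.normal(size=(n_samples, n_features))
X[:, :5] += 2.0 * y[:, None]
train = np.arange(n_samples) < 300
test = ~train

def evaluate(n_components):
    """Fit PCA and Naive Bayes on the training split; return (k, test accuracy)."""
    mean, comps = fit_pca(X[train], n_components)
    model = fit_gaussian_nb(project(X[train], mean, comps), y[train])
    pred = predict_gaussian_nb(model, project(X[test], mean, comps))
    return comps.shape[0], float(np.mean(pred == y[test]))

k_fixed, acc_fixed = evaluate(7)     # fixed number of components
k_var, acc_var = evaluate(0.95)      # preserved-variance threshold
print(f"fixed: k={k_fixed}, accuracy={acc_fixed:.3f}")
print(f"variance: k={k_var}, accuracy={acc_var:.3f}")
```

The PCA projection is fitted on the training split only and then applied to the test split, so the held-out accuracy is not influenced by test data. Which setting performs better depends on the dataset; the sketch only shows how the two selection rules differ.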

Keywords:
Principal component analysis; Naive Bayes classifier; Computer science; Artificial intelligence; Bayes' theorem; Pattern recognition (psychology); Natural language processing; Information retrieval; Bayesian probability; Support vector machine

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.04

Topics

Spam and Phishing Detection
Physical Sciences →  Computer Science →  Information Systems
Internet Traffic Analysis and Secure E-voting
Physical Sciences →  Computer Science →  Artificial Intelligence
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
