JOURNAL ARTICLE

Spam/Ham email classification using BERT

Siwei Zhang

Year: 2023 Journal: Applied and Computational Engineering Vol: 6 (1) Pages: 1189-1195

Abstract

Email is a popular means of communication. However, because sending email is free of charge as long as an email server and a domain name are available, spam has become a critical problem for email networks. Conventionally, the industry counters spam with filters based on rules and Bayesian inference, which reach an accuracy of 98.76%, a figure that is still far from satisfactory. To better protect email users from unsolicited messages containing advertisements, sensitive content, phishing content, and viruses, a new approach is proposed in which email content is filtered by a spam detector built on bidirectional encoder representations from transformers (BERT). BERT is a language representation model published by Google that has achieved great success because of its powerful natural language understanding capabilities. After training on a corpus from Kaggle, the BERT-based spam detector reaches a binary accuracy of 99.40% when classifying spam mail.
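The abstract does not describe either filter's implementation, so the following is only a minimal sketch of the two ideas it names: a Bayesian-inference filter of the conventional kind, and the binary accuracy metric used to report results. It is not the BERT detector from the paper. The class name `NaiveBayesSpamFilter`, the helper functions, and the toy messages are all hypothetical, chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase whitespace tokenization; a production filter would do more.
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Multinomial naive Bayes with add-one (Laplace) smoothing (illustrative)."""

    def fit(self, texts, labels):
        # Per-class token counts and per-class document counts.
        self.word_counts = {0: Counter(), 1: Counter()}
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label in (0, 1):
            # Log prior plus the summed log likelihood of each token.
            score = math.log(self.class_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(text):
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            scores[label] = score
        return 1 if scores[1] > scores[0] else 0


def binary_accuracy(y_true, y_pred):
    # Fraction of predictions that match the true 0/1 labels.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up toy messages (1 = spam, 0 = ham); this is not the Kaggle corpus.
train_texts = [
    "win a free prize now",
    "free money click now",
    "meeting agenda for monday",
    "lunch on monday with the team",
]
train_labels = [1, 1, 0, 0]

model = NaiveBayesSpamFilter().fit(train_texts, train_labels)
test_texts = ["claim your free prize", "agenda for the team meeting"]
test_labels = [1, 0]
predictions = [model.predict(t) for t in test_texts]
accuracy = binary_accuracy(test_labels, predictions)
print(predictions, accuracy)
```

Binary accuracy is simply the share of messages whose predicted label matches the true label, so the 98.76% and 99.40% figures in the abstract correspond to roughly 12 and 6 misclassified messages per 1,000, respectively.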

Keywords:
Computer science; Phishing; Forum spam; Spamming; Spambot; Language model; Inference; Artificial intelligence; Naive Bayes classifier; Keyword spotting; Information retrieval; Natural language processing; World Wide Web; The Internet; Support vector machine

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.62
Refs: 7
Citation Normalized Percentile: 0.69

Topics

Spam and Phishing Detection
Physical Sciences → Computer Science → Information Systems
Topic Modeling
Physical Sciences → Computer Science → Artificial Intelligence
Sentiment Analysis and Opinion Mining
Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Spam/Ham Email Classification Using Machine Learning

Rajeev Yadav, Ramakant Gzautam, Dimpy Arora, Payal Rathore, Akshat Pareek

Journal: Industrial Engineering Journal Year: 2023 Vol: 52 Pages: 195-201

JOURNAL ARTICLE

SHED: Spam Ham Email Dataset

Upasna Sharma, Surinder Singh Khurana

Journal: International Journal on Recent and Innovation Trends in Computing and Communication Year: 2017 Vol: 5 (6)

JOURNAL ARTICLE

Spam or ham? [email protection filter]

Brynn Ellis

Journal: Engineering & Technology Year: 2008 Vol: 3 (11) Pages: 32-33