JOURNAL ARTICLE

Online News Classification Using Multinomial Naive Bayes

Abstract

The wide availability of text in many forms is a valuable information resource that can be used for various purposes. Classification is one of the text mining methods used to analyze text documents. Text classification is the process of assigning a document to a category based on a trained model. This study aimed to categorize Indonesian news automatically using Multinomial Naive Bayes. To improve the results, feature selection using the Document Frequency Thresholding (DF-Thresholding) method and term weighting using Term Frequency-Inverse Document Frequency (TF-IDF) were applied. The experiment showed that Multinomial Naive Bayes with TF-IDF produced the highest average accuracy, 86.62%, while plain Multinomial Naive Bayes reached 86.28%, Multinomial Naive Bayes with DF-Thresholding and TF-IDF reached 86.15%, and Multinomial Naive Bayes with DF-Thresholding reached 85.98%. Feature selection with Document Frequency Thresholding is therefore an efficient way to reduce the dimensionality of the data: the final accuracy differs only insignificantly from that of Multinomial Naive Bayes without feature selection.
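The pipeline named in the abstract (DF-Thresholding, then TF-IDF weighting, then Multinomial Naive Bayes) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy English corpus, the `min_df=2` threshold, the `log(N/df) + 1` IDF variant, the Laplace smoothing value `alpha=1.0`, and all function names are assumptions chosen for illustration only.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus (the study itself used Indonesian news articles).
TRAIN = [
    ("team wins football match", "sport"),
    ("football player scores goal", "sport"),
    ("coach praises team after match", "sport"),
    ("market shares rise as bank profits grow", "economy"),
    ("bank raises interest rate", "economy"),
    ("market reacts to interest rate", "economy"),
]

def tokenize(text):
    # Simplistic whitespace tokenizer; real preprocessing would do more.
    return text.lower().split()

def df_threshold(token_lists, min_df):
    # Document frequency = number of documents containing the term.
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    # Keep only terms that occur in at least min_df documents.
    vocab = {t for t, n in df.items() if n >= min_df}
    return vocab, df

def tfidf(tokens, vocab, idf):
    # Raw term frequency times IDF, restricted to the selected vocabulary.
    tf = Counter(t for t in tokens if t in vocab)
    return {t: count * idf[t] for t, count in tf.items()}

def train_mnb(vectors, labels, vocab, alpha=1.0):
    # Class priors from label frequencies.
    n_docs = len(labels)
    log_prior = {c: math.log(k / n_docs) for c, k in Counter(labels).items()}
    # Accumulate per-class feature weight totals.
    totals = {c: defaultdict(float) for c in log_prior}
    for vec, label in zip(vectors, labels):
        for term, weight in vec.items():
            totals[label][term] += weight
    # Smoothed per-class term log-likelihoods (Laplace smoothing, assumed).
    log_lik = {}
    for c, weights in totals.items():
        denom = sum(weights.values()) + alpha * len(vocab)
        log_lik[c] = {
            t: math.log((weights.get(t, 0.0) + alpha) / denom) for t in vocab
        }
    return log_prior, log_lik

def predict(vec, log_prior, log_lik):
    # Score each class: log prior plus weighted sum of term log-likelihoods.
    scores = {
        c: log_prior[c] + sum(w * log_lik[c][t] for t, w in vec.items())
        for c in log_prior
    }
    return max(scores, key=scores.get)

token_lists = [tokenize(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
vocab, df = df_threshold(token_lists, min_df=2)
# Assumed IDF variant: log(N / df) + 1, so no selected term gets zero weight.
idf = {t: math.log(len(TRAIN) / df[t]) + 1.0 for t in vocab}
vectors = [tfidf(tokens, vocab, idf) for tokens in token_lists]
log_prior, log_lik = train_mnb(vectors, labels, vocab)

def classify(text):
    return predict(tfidf(tokenize(text), vocab, idf), log_prior, log_lik)

print(classify("football team match"))
print(classify("bank interest rate"))
```

Dropping the `df_threshold` step (keeping every term) or replacing the TF-IDF weights with raw counts gives the other three configurations the abstract compares.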

Keywords:
Naive Bayes classifier; Multinomial distribution; Computer science; tf–idf; Thresholding; Artificial intelligence; Feature selection; Pattern recognition (psychology); Categorization; Term (time); Mathematics; Statistics; Support vector machine

Metrics

Cited By: 38
FWCI (Field Weighted Citation Impact): 5.60
Refs: 0
Citation Normalized Percentile: 0.96

Topics

Data Mining and Machine Learning Applications
Physical Sciences →  Computer Science →  Information Systems
Educational Technology Systems
Physical Sciences →  Computer Science →  Artificial Intelligence
Information Retrieval and Data Mining
Physical Sciences →  Computer Science →  Information Systems