JOURNAL ARTICLE

Weighted Naive Bayes Text Classification Algorithm Based on Poisson Distribution

ZHAO Bowen, WANG Lingjiao, GUO Hua

Year: 2020 Journal:   DOAJ (DOAJ: Directory of Open Access Journals)

Abstract

The Naive Bayes (NB) algorithm is simple and efficient for text classification, but its accuracy is limited by its intrinsic assumptions that attributes are independent of one another and equally important. To address this problem, this paper proposes a feature-weighted NB text classification algorithm based on the Poisson distribution. The algorithm combines the Poisson distribution model with the NB algorithm and introduces a Poisson random variable into the weights of feature words. On this basis, the Information Gain Ratio (IGR) is defined to weight the feature words of texts, thereby reducing the effect of the attribute independence assumption of traditional algorithms. Experimental results on the 20-newsgroups data set show that, compared with the NB algorithm and its improved variants RW, C-MNB, and CFSNB, the proposed algorithm improves the accuracy, recall, and F1 value of text classification. Its execution efficiency is also higher than that of the K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) algorithms.
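
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that word counts per document are modelled with a class-specific Poisson rate, that each word's weight is its information gain ratio computed on presence/absence, and that the weight multiplies the word's log-likelihood. The function names (`gain_ratio_weights`, `train`, `predict`), the smoothing parameter `alpha`, and the toy documents are all hypothetical.

```python
import math


def poisson_log_pmf(k, lam):
    """Log of the Poisson pmf: exp(-lam) * lam**k / k!."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)


def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability terms."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def gain_ratio_weights(docs, labels, vocab):
    """Assumed weighting: information gain ratio of word presence vs. class."""
    n = len(docs)
    classes = sorted(set(labels))
    h_class = entropy([labels.count(c) / n for c in classes])
    weights = {}
    for w in vocab:
        present = [y for d, y in zip(docs, labels) if d.get(w, 0) > 0]
        absent = [y for d, y in zip(docs, labels) if d.get(w, 0) == 0]
        # Conditional class entropy given presence/absence of the word.
        cond = sum(
            len(part) / n * entropy([part.count(c) / len(part) for c in classes])
            for part in (present, absent) if part
        )
        # Split information normalises the gain into a ratio.
        split = entropy([len(present) / n, len(absent) / n])
        weights[w] = (h_class - cond) / split if split > 0 else 0.0
    return weights


def train(docs, labels, alpha=1.0):
    """Estimate class priors, smoothed Poisson rates, and word weights."""
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    priors, rates = {}, {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        priors[c] = len(class_docs) / n
        # Smoothed mean count per document; alpha keeps every rate above zero.
        rates[c] = {
            w: (sum(d.get(w, 0) for d in class_docs) + alpha) / len(class_docs)
            for w in vocab
        }
    weights = gain_ratio_weights(docs, labels, vocab)
    return {"vocab": vocab, "priors": priors, "rates": rates, "weights": weights}


def predict(model, doc):
    """Return the class with the highest weighted log-posterior score."""
    def score(c):
        return math.log(model["priors"][c]) + sum(
            model["weights"][w] * poisson_log_pmf(doc.get(w, 0), model["rates"][c][w])
            for w in model["vocab"]
        )
    return max(model["priors"], key=score)


# Toy bag-of-words documents (hypothetical data, word -> count).
docs = [
    {"ball": 3, "team": 2},
    {"ball": 2, "game": 1},
    {"vote": 3, "party": 2},
    {"vote": 1, "law": 2},
]
labels = ["sport", "sport", "politics", "politics"]
model = train(docs, labels)
sport_guess = predict(model, {"ball": 2, "team": 1})
politics_guess = predict(model, {"vote": 2, "law": 1})
```

In this sketch a word that perfectly separates the classes (such as `ball` in the toy data) receives weight 1, while words seen in only one document receive a smaller weight, so they contribute less to the final score than they would under unweighted NB.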

Keywords:
Poisson distribution; Independence (probability theory); Feature (linguistics); Naive Bayes classifier; Set (abstract data type); Random variable; Pattern recognition (psychology); Data set; Bottleneck

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.15
Refs: 0
Citation Normalized Percentile: 0.61

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Sentiment Analysis and Opinion Mining
Physical Sciences →  Computer Science →  Artificial Intelligence
Big Data and Digital Economy
Physical Sciences →  Computer Science →  Information Systems

Related Documents

JOURNAL ARTICLE

A New Naive Bayes Text Classification Algorithm

Liguo Duan, Peng Di, Aiping Li

Journal:   TELKOMNIKA Indonesian Journal of Electrical Engineering Year: 2013 Vol: 12 (2)
JOURNAL ARTICLE

A Chinese text classification system based on Naive Bayes algorithm

Wei Cui

Journal:   MATEC Web of Conferences Year: 2016 Vol: 44 Pages: 01015-01015