Adapting Hidden Naive Bayes for Text Classification

Shengfeng Gan; Shiqi Shao; Long Chen; Liangjun Yu; Liangxiao Jiang

doi:10.3390/math9192378

JOURNAL ARTICLE

Year: 2021; Journal: Mathematics; Vol: 9 (19); Pages: 2378; Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Due to its simplicity, efficiency, and effectiveness, multinomial naive Bayes (MNB) has been widely used for text classification. As in naive Bayes (NB), however, its assumption that features are conditionally independent given the class is often violated, which reduces its classification performance. Of the numerous approaches to alleviating this assumption, structure extension has attracted less attention from researchers. To the best of our knowledge, only structure-extended MNB (SEMNB) has been proposed so far. SEMNB averages all weighted super-parent one-dependence multinomial estimators; therefore, it is an ensemble learning model. In this paper, we propose a single model called hidden MNB (HMNB) by adapting the well-known hidden NB (HNB). HMNB creates a hidden parent for each feature, which synthesizes the influences of all the other qualified features. To train HMNB, we propose a simple but effective learning algorithm that avoids a structure-learning process with high computational complexity. The same idea can also be applied to complement NB (CNB) and the one-versus-all-but-one model (OVA); the resulting models are denoted HCNB and HOVA, respectively. Extensive experiments on eleven benchmark text classification datasets validate the effectiveness of HMNB, HCNB, and HOVA.
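
As a rough illustration of the two ideas in the abstract, the Python sketch below first implements a plain MNB classifier with Laplace smoothing and then estimates a hidden-parent probability as a weighted mixture of one-dependence estimates. The function names (train_mnb, predict_mnb, hidden_parent_prob), the toy data, and especially the uniform mixture weights are assumptions made for illustration. The abstract does not state how HMNB weights the other features or which features count as "qualified", so this is not the authors' learning algorithm.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels):
    """Fit multinomial naive Bayes with Laplace (add-one) smoothing.

    docs is a list of token lists; labels is a parallel list of class labels.
    """
    vocab = sorted({w for d in docs for w in d})
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for d, c in zip(docs, labels):
        word_counts[c].update(d)
    n, k, m = len(docs), len(class_docs), len(vocab)
    log_prior = {c: math.log((class_docs[c] + 1) / (n + k)) for c in class_docs}
    log_cond = {}
    for c in class_docs:
        total = sum(word_counts[c].values())
        log_cond[c] = {
            w: math.log((word_counts[c][w] + 1) / (total + m)) for w in vocab
        }
    return log_prior, log_cond


def predict_mnb(model, doc):
    """Return argmax_c of log P(c) + sum_i f_i * log P(w_i | c)."""
    log_prior, log_cond = model
    freqs = Counter(doc)
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for w, f in freqs.items():
            if w in log_cond[c]:  # words outside the vocabulary are skipped
                score += f * log_cond[c][w]
        scores[c] = score
    return max(scores, key=scores.get)


def hidden_parent_prob(docs, labels, target, doc, c, vocab_size, weights=None):
    """Mix one-dependence estimates P(target | w_t, c) over the other words in doc.

    ASSUMPTION: weights default to uniform; the paper's weighting is not used here.
    Each P(target | w_t, c) is estimated, with add-one smoothing, from the
    class-c training documents that contain w_t.
    """
    parents = sorted(set(doc) - {target})
    if not parents:
        return None  # no other word available to act as a parent
    if weights is None:
        weights = {t: 1.0 for t in parents}
    z = sum(weights[t] for t in parents)
    mixture = 0.0
    for t in parents:
        num, den = 1.0, float(vocab_size)
        for d, y in zip(docs, labels):
            if y == c and t in d:
                num += d.count(target)
                den += len(d)
        mixture += (weights[t] / z) * (num / den)
    return mixture


# Toy corpus (hypothetical data, for illustration only).
DOCS = [
    ["cheap", "pills", "cheap"],
    ["meeting", "agenda"],
    ["cheap", "offer"],
    ["agenda", "notes", "meeting"],
]
LABELS = ["spam", "ham", "spam", "ham"]

MODEL = train_mnb(DOCS, LABELS)
```

With this toy corpus, predict_mnb(MODEL, ["cheap", "offer", "pills"]) classifies the message from independent per-word estimates, while hidden_parent_prob(DOCS, LABELS, "cheap", ["cheap", "offer", "pills"], "spam", 6) conditions the estimate for "cheap" on the co-occurring words "offer" and "pills". A single model of this kind keeps one mixed estimate per feature rather than averaging a set of separate classifiers, which is the contrast the abstract draws with SEMNB.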

Keywords:

Naive Bayes classifier; Complement (music); Computer science; Artificial intelligence; Conditional independence; Machine learning; Independence (probability theory); Feature (linguistics); Benchmark (surveying); Estimator; Multinomial distribution; Pattern recognition (psychology); Mathematics; Support vector machine; Statistics

Metrics

FWCI (Field Weighted Citation Impact): 1.83

Citation Normalized Percentile: 0.88

Topics

Bayesian Modeling and Causal Inference

Text and Document Classification Technologies

Advanced Text Analysis Techniques

Field (all three topics): Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Adapting naive Bayes tree for text classification

Text classification and Naive Bayes

Negation Naive Bayes for Text Classification

A New Naive Bayes Text Classification Algorithm

Internet traffic classification using Hidden Naive Bayes model