The Naive Bayes classifier performs well on many datasets, but it performs very poorly on datasets whose attributes are strongly correlated, because its conditional independence assumption does not always hold in the real world. In the recent Hidden Naive Bayes (HNB) algorithm, each attribute has a hidden parent that combines the influences of all other attributes. Its performance is significantly better than that of other Bayesian algorithms, but it costs too much test time on high-dimensional datasets. In this paper, to find the optimal combination of Naive Bayes and HNB, we propose a novel model, Packaged Hidden Naive Bayes (PHNB), in which the number of attributes in each hidden parent is controlled through a packaging idea. Our experiments show that, compared to HNB, PHNB significantly reduces test time on many high-dimensional datasets and achieves higher accuracy on some particular datasets.
Liangxiao Jiang, H. Zhang, Zhihua Cai
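
To make the hidden-parent idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the hidden parent's influence on attribute i can be approximated as a weighted mixture of one-dependence estimates P(a_i | a_j, c), and that "packaging" amounts to restricting the mixture to a chosen subset of attributes. The function name `hidden_parent_prob`, its parameters, and the fallback to P(a_i | c) for an empty package are all hypothetical choices made for illustration; the abstract does not specify how packages are formed or how weights are computed.

```python
from typing import Dict, Iterable, Optional


def hidden_parent_prob(
    i: int,
    cond_probs: Dict[int, float],
    weights: Dict[int, float],
    class_cond_prob: float,
    package: Optional[Iterable[int]] = None,
) -> float:
    """Approximate P(a_i | hidden parent of a_i, c) as a weighted mixture.

    cond_probs[j]   : an estimate of P(a_i | a_j, c) for another attribute j.
    weights[j]      : a non-negative importance score for attribute j
                      (assumed to be supplied by the caller).
    class_cond_prob : P(a_i | c), the plain Naive Bayes term, used as a
                      fallback when no parent attribute contributes.
    package         : hypothetical subset of attribute indices allowed in
                      the hidden parent; None means "all other attributes".
    """
    # Candidate parents are all attributes other than i itself.
    parents = [j for j in cond_probs if j != i]

    # Assumed packaging step: keep only attributes inside the package.
    if package is not None:
        allowed = set(package)
        parents = [j for j in parents if j in allowed]

    total = sum(weights[j] for j in parents)
    if not parents or total == 0:
        # No usable parent: behave like Naive Bayes for this attribute.
        return class_cond_prob

    # Normalise the weights so the mixture is a proper convex combination.
    return sum((weights[j] / total) * cond_probs[j] for j in parents)


# Toy numbers, invented purely for illustration.
probs = {1: 0.2, 2: 0.6, 3: 0.4}
w = {1: 1.0, 2: 3.0, 3: 0.0}

full = hidden_parent_prob(0, probs, w, class_cond_prob=0.3)
small = hidden_parent_prob(0, probs, w, class_cond_prob=0.3, package=[1])
empty = hidden_parent_prob(0, probs, w, class_cond_prob=0.3, package=[])
```

Under these assumptions, an empty package reduces the term to the Naive Bayes estimate and a full package uses every other attribute, so the package size is the knob that trades test-time cost against how much attribute dependence is modelled.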