With the growth of the Internet, the volume of text data keeps increasing. Companies face the challenge of organizing this content, and users struggle to find what is useful to them. Classifying text automatically helps address both problems. For a long time, text classification has been done manually by people such as editors; news editors, for example, have to tag each article by hand. Text classification has therefore become a hot topic in natural language processing, especially for Chinese text. Sentiment classification only needs to distinguish two classes, but many situations require multi-class classification. There are several approaches to the text classification problem: (1) the Naive Bayes algorithm, (2) support vector machines, (3) neural networks, (4) k-nearest neighbors, and (5) decision trees [1][2][3][4][5]. Naive Bayes applies Bayes' theorem with strong (naive) independence assumptions between the features. This paper proposes using Naive Bayes to perform multi-class classification of Chinese news text into nine classes.
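To make the independence assumption concrete, the following is a minimal sketch of a multinomial Naive Bayes text classifier. It is an illustration under stated assumptions, not the implementation used in this paper: the function names `train_naive_bayes` and `predict` are hypothetical, add-one (Laplace) smoothing is assumed, the toy corpus has three made-up classes rather than nine, and the documents are assumed to be already split into tokens (Chinese text would first need a word segmentation step, which is omitted here).

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and token occurrences per class.

    docs   -- list of token lists (already segmented; an assumption)
    labels -- list of class names, one per document
    """
    class_counts = Counter(labels)          # number of documents in each class
    word_counts = defaultdict(Counter)      # token frequencies within each class
    vocab = set()                           # all tokens seen during training
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(tokens, model):
    """Return the class with the highest log posterior for the given tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        # Log prior: share of training documents that belong to this class.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            # Naive assumption: each token contributes independently.
            # Add-one smoothing avoids a zero probability for unseen pairs.
            likelihood = (word_counts[label][tok] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up training data with three classes (the paper uses nine).
train_docs = [
    ["team", "match", "goal"],
    ["stock", "market", "price"],
    ["phone", "chip", "market"],
]
train_labels = ["sports", "finance", "tech"]

model = train_naive_bayes(train_docs, train_labels)
print(predict(["team", "match"], model))
print(predict(["stock", "price"], model))
```

Working in log space keeps the product of many small probabilities from underflowing. The same scoring loop extends to any number of classes, since each class is scored independently and the highest score wins.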
Peng Liu, Hui-han Zhao, Jia-yu Teng, Yanyan Yang, Yafeng Liu, Zongwei Zhu
Xinyi Sun, Liming Du, Yanshuang Wang