Online hotel reviews strongly influence other users' purchasing decisions, and review data are characterized by large volume, fast growth, many types, and low value density. To support consumers' purchasing decisions, an effective method is needed to quickly mine customers' true sentiment from these reviews, so that hotels can be classified and evaluated or a basis for recommendation can be provided. This paper therefore proposes a multinomial Naive Bayes sentiment classification model for hotel reviews. Based on the characteristics of hotel reviews, the stop word list is expanded using word frequency statistics; removing more irrelevant words helps filter out noise during classification. In addition, to address the curse of dimensionality caused by the bag-of-words model, dimensionality is reduced using the information gain method. The experimental results show that the extended hotel stop word list has little effect on the classification results but clearly improves efficiency. Furthermore, the model achieves better classification results than an ordinary multinomial Naive Bayes classifier.
Arif Abdurrahman Farisi, Yuliant Sibaroni, Said Al Faraby
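
The pipeline outlined in the abstract (stop word expansion from word frequencies, information gain feature selection, multinomial Naive Bayes) can be illustrated with a minimal sketch. This is not the paper's implementation: the document-frequency threshold `min_doc_ratio`, the Laplace smoothing constant `alpha`, all function names, and the four toy reviews are assumptions made purely for illustration.

```python
import math
from collections import Counter

def expand_stop_words(docs, base_stop_words, min_doc_ratio=0.9):
    """Add words found in at least min_doc_ratio of the documents (assumed rule)."""
    doc_freq = Counter(w for d in docs for w in set(d))
    frequent = {w for w, c in doc_freq.items() if c / len(docs) >= min_doc_ratio}
    return set(base_stop_words) | frequent

def entropy(counts):
    """Shannon entropy (base 2) of a list of class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

def information_gain(docs, labels, word):
    """IG(word) = H(class) - H(class | word present or absent)."""
    classes = sorted(set(labels))
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    h_cond = 0.0
    for part in (present, absent):
        if part:
            weight = len(part) / len(labels)
            h_cond += weight * entropy([part.count(c) for c in classes])
    return entropy([labels.count(c) for c in classes]) - h_cond

def select_features(docs, labels, k):
    """Keep the k words with the highest information gain."""
    vocab = sorted({w for d in docs for w in d})
    ranked = sorted(vocab, key=lambda w: information_gain(docs, labels, w),
                    reverse=True)
    return set(ranked[:k])

def train_mnb(docs, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes with Laplace smoothing over the selected vocab."""
    classes = sorted(set(labels))
    log_prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    log_like = {}
    for c in classes:
        counts = Counter(w for d, y in zip(docs, labels) if y == c
                         for w in d if w in vocab)
        total = sum(counts.values()) + alpha * len(vocab)
        log_like[c] = {w: math.log((counts[w] + alpha) / total) for w in vocab}
    return log_prior, log_like

def predict(model, doc):
    """Return the class with the highest log posterior; unknown words are skipped."""
    log_prior, log_like = model
    scores = {c: log_prior[c] + sum(log_like[c][w] for w in doc if w in log_like[c])
              for c in log_prior}
    return max(scores, key=scores.get)

# Toy, hand-made reviews (hypothetical data, already tokenized).
raw_docs = [
    ["the", "room", "was", "clean", "and", "great"],
    ["the", "staff", "was", "friendly", "great", "stay"],
    ["the", "room", "was", "dirty", "and", "noisy"],
    ["the", "service", "was", "terrible", "dirty", "bathroom"],
]
labels = ["pos", "pos", "neg", "neg"]

stop_words = expand_stop_words(raw_docs, base_stop_words={"and"})
docs = [[w for w in d if w not in stop_words] for d in raw_docs]
features = select_features(docs, labels, k=2)
model = train_mnb(docs, labels, features)

print(sorted(stop_words), sorted(features))
print(predict(model, ["great", "location"]), predict(model, ["dirty", "sheets"]))
```

On this toy data the expanded stop list absorbs the words that appear in every review, and the information gain step keeps only the two words that separate the classes perfectly. A real corpus would need word segmentation, a much larger vocabulary, and tuned thresholds.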