Automatic text categorization has progressed rapidly in recent years and has become one of the most active topics in information processing. Text tendency classification is a type of text categorization with important applications in information retrieval, identification and filtering of harmful information, content security management, and analysis of public opinion tendency. Because feature selection strongly influences text classification accuracy, this paper studies feature selection methods for tendency classification. First, it analyzes and summarizes the variety of current methods and identifies three common ideas behind feature selection. Then, based on an analysis of the complexity of tendency classification, it shows that feature selection based on the distribution of features across text categories is better suited to tendency classification than feature selection based on the correlation between features and categories. Finally, it presents test results for balanced and unbalanced training sets.
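To make the two families of feature selection named in the abstract concrete, the following minimal Python sketch scores terms on a toy two-category corpus in both ways. The abstract does not name specific scoring functions, so the choices here are assumptions made for illustration only: the chi-square statistic stands in for a correlation-based score, and the gap between a term's per-category document-frequency rates stands in for a distribution-based score. The corpus and the helper names (`chi_square`, `distribution_gap`, `top_k`) are hypothetical and are not taken from the paper.

```python
# Toy corpus (hypothetical): each document is a set of tokens.
pos_docs = [{"great", "plot", "actor"}, {"great", "actor"}, {"good", "plot"}]
neg_docs = [{"boring", "plot", "actor"}, {"bad", "plot"}, {"boring", "actor"}]


def chi_square(term, pos_docs, neg_docs):
    """Correlation-style score (assumed stand-in): chi-square statistic of
    the 2x2 table of term presence versus category."""
    a = sum(term in doc for doc in pos_docs)  # positive docs containing term
    b = sum(term in doc for doc in neg_docs)  # negative docs containing term
    c = len(pos_docs) - a                     # positive docs without term
    d = len(neg_docs) - b                     # negative docs without term
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def distribution_gap(term, pos_docs, neg_docs):
    """Distribution-style score (assumed stand-in): absolute gap between the
    term's document-frequency rates in the two categories."""
    p_pos = sum(term in doc for doc in pos_docs) / len(pos_docs)
    p_neg = sum(term in doc for doc in neg_docs) / len(neg_docs)
    return abs(p_pos - p_neg)


def top_k(score, pos_docs, neg_docs, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    vocab = {term for doc in pos_docs + neg_docs for term in doc}
    ranked = sorted(vocab, key=lambda t: (-score(t, pos_docs, neg_docs), t))
    return ranked[:k]


if __name__ == "__main__":
    for name, score in [("chi-square", chi_square), ("gap", distribution_gap)]:
        print(name, top_k(score, pos_docs, neg_docs, 2))
```

In this sketch, terms such as "plot" and "actor" appear equally often in both categories and score zero under either function, while "great" and "boring" score highest. The toy corpus is balanced, so it does not reproduce the balanced-versus-unbalanced comparison that the paper reports.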