Hugo L. Hammer, Anis Yazidi, Aleksander Bai, Paal Engelstad
Classifying tweets is an intrinsically hard task because tweets are short messages, which makes traditional bag-of-words approaches inefficient. In fact, bag-of-words approaches ignore relationships between important terms that do not co-occur literally. In this paper we resort to word-word co-occurrence information from a large corpus to expand the vocabulary of another corpus consisting of tweets. Our results show that we are able to reduce the number of erroneous classifications by 14% using co-occurrence information.
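As a rough illustration of the idea of expanding a short text with co-occurrence information, the following minimal sketch counts word-word co-occurrences in a reference corpus and adds the most frequent neighbors of each tweet token to the tweet's bag of words. It is not the authors' implementation: the function names `build_cooccurrence` and `expand_tweet`, the sliding-window definition of co-occurrence, and the `window` and `top_k` parameters are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def build_cooccurrence(corpus, window=2):
    """Count how often two words appear within `window` tokens of each other.

    Hypothetical helper: the window-based definition is an assumption,
    not something stated in the abstract.
    """
    cooc = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    cooc[word][tokens[j]] += 1
    return cooc


def expand_tweet(tweet, cooc, top_k=2):
    """Return the tweet's bag of words plus, for each token, its `top_k`
    most frequent co-occurring terms from the reference corpus.

    Hypothetical helper: adding each neighbor with weight 1 is an
    illustrative choice.
    """
    tokens = tweet.lower().split()
    expanded = Counter(tokens)
    for word in tokens:
        for neighbor, _count in cooc.get(word, Counter()).most_common(top_k):
            expanded[neighbor] += 1
    return expanded
```

The expanded bag of words could then be passed to any standard text classifier in place of the original one, so that two tweets sharing no literal terms may still overlap through their added neighbors.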
Kazuhiro Morita, El‐Sayed Atlam, Masao Fuketra, Kazuhiko Tsuda, Masaki Oono, Jun‐ichi Aoe
Sungwon Seo, Jong‐Kook Kim, Lynn Choi
Sungwon Seo, Jong‐Kook Kim, Sung-Il Kim, Jeewoo Kim, Joongheon Kim