JOURNAL ARTICLE

Short Text Embedding for Clustering Based on Word and Topic Semantic Information

Abstract

Short text clustering is used in many applications and has become an important problem, but it remains challenging because traditional short text representations are sparse. Early methods either waste space or ignore word order. To address these problems, a self-taught convolutional neural network model was proposed to construct short text representations. However, it extracts semantic information only from the word context, without any other unsupervised features, and ignores the differing contributions of textual content to clustering. In this paper, we propose an effective short text embedding method for clustering based on word and topic semantic information (STE-WT). By exploiting topic semantic information and using an attention mechanism to capture the differing contributions of the content, the proposed model constructs better short text representations for clustering. Extensive experiments on real datasets demonstrate that our framework is effective and outperforms state-of-the-art methods.
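
The abstract does not specify the STE-WT architecture, so the following is only a minimal sketch of the general idea it names: weighting word vectors with an attention mechanism and combining the result with topic information. The function `attention_pool`, the dot-product scoring, the concatenation step, and the random toy vectors are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

def attention_pool(word_vecs, topic_vec):
    """Weight each word vector by its similarity to a topic vector, then sum.

    word_vecs: (n_words, dim) array of word embeddings (hypothetical input).
    topic_vec: (dim,) array summarizing the text's topic (hypothetical input).
    Returns the pooled (dim,) text vector and the (n_words,) attention weights.
    """
    scores = word_vecs @ topic_vec   # one relevance score per word (assumed scoring rule)
    weights = softmax(scores)        # non-negative weights that sum to 1
    pooled = weights @ word_vecs     # attention-weighted average of the word vectors
    return pooled, weights

# Toy data only: random vectors stand in for trained word and topic embeddings.
rng = np.random.default_rng(0)
word_vecs = rng.normal(size=(5, 8))  # 5 words, 8-dimensional embeddings
topic_vec = rng.normal(size=8)       # one 8-dimensional topic vector

text_vec, weights = attention_pool(word_vecs, topic_vec)

# Assumed combination step: concatenate word-level and topic-level information.
embedding = np.concatenate([text_vec, topic_vec])
```

In a sketch like this, words whose vectors align more closely with the topic vector receive larger weights, which is one simple way to model unequal contributions of content. The resulting `embedding` could then be passed to any standard clustering algorithm.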

Keywords:
Computer science; Cluster analysis; Word embedding; Artificial intelligence; Natural language processing; Word (group theory); Document clustering; Context (archaeology); Construct (python library); Task (project management); Embedding; Information retrieval; Mathematics

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.31
Refs: 48
Citation Normalized Percentile: 0.68

Topics

Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Text Analysis Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Probabilistic topic modeling for short text based on word embedding networks

Marcelo Pita, Matheus Nunes, Gisele L. Pappa

Journal: Applied Intelligence, Year: 2022, Vol: 52 (15), Pages: 17829-17844

JOURNAL ARTICLE

Short Text Classification Based on Latent Topic Modeling and Word Embedding

Peng Li, Junqing He, Chenglong Ma

Journal: DEStech Transactions on Computer Science and Engineering, Year: 2017

BOOK-CHAPTER

Text Semantic Steganalysis Based on Word Embedding

Xin Zuo, Huanhuan Hu, Weiming Zhang, Nenghai Yu

Lecture Notes in Computer Science, Year: 2018, Pages: 485-495