JOURNAL ARTICLE

Automatic Keyword Extraction Using Word Embedding and Clustering

Ping Zeng, Qingping Tan, Ying Yan, Qinzheng Xie, Jianjun Xu, Wei Cao

Year: 2017
Journal: 2017 International Conference on Computer Systems, Electronics and Control (ICCSEC)
Pages: 1402-1408

Abstract

Existing word-frequency-based algorithms for keyword extraction do not consider the semantic relationships among words. Word-graph-based algorithms cannot distinguish multiple topics, and topic-model-based algorithms have high time complexity. To address these limitations, this paper proposes WEC, a new word-embedding-based algorithm for keyword extraction. The algorithm incorporates word frequency, the effects of word co-occurrence, and the semantic relationships among contexts. It estimates the final weights of words using cosine similarity and pointwise mutual information, and extracts topics by clustering. Experimental results show that WEC outperforms state-of-the-art keyword extraction methods on four datasets under various evaluation metrics.
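The abstract names the ingredients of WEC (word frequency, co-occurrence, cosine similarity, pointwise mutual information, clustering) but not how they are combined. The following is only a rough sketch of how such ingredients might fit together, not the authors' formulation. The scoring rule in `word_weights`, the greedy threshold clustering in `cluster`, the sentence-level co-occurrence window, and all function and parameter names are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pmi_table(sentences):
    """PMI for every word pair that co-occurs in at least one sentence.

    Probabilities are estimated as the fraction of sentences containing
    the word (or both words). This window choice is an assumption.
    """
    n = len(sentences)
    word_df = Counter(w for s in sentences for w in set(s))
    pair_df = Counter(
        pair for s in sentences for pair in combinations(sorted(set(s)), 2)
    )
    return {
        (a, b): math.log((c / n) / ((word_df[a] / n) * (word_df[b] / n)))
        for (a, b), c in pair_df.items()
    }


def word_weights(sentences, vectors):
    """Hypothetical score: term frequency plus, for each co-occurring pair,
    cosine similarity times positive PMI. Not the formula from the paper."""
    tf = Counter(w for s in sentences for w in s)
    score = {w: float(c) for w, c in tf.items()}
    for (a, b), pmi in pmi_table(sentences).items():
        if a in vectors and b in vectors:
            bonus = cosine(vectors[a], vectors[b]) * max(pmi, 0.0)
            score[a] += bonus
            score[b] += bonus
    return score


def cluster(words, vectors, threshold=0.8):
    """Greedy single-pass clustering (a stand-in for whatever clustering
    the paper uses): a word joins the first cluster whose first member
    is at least `threshold` similar, otherwise it starts a new cluster."""
    clusters = []
    for w in words:
        for c in clusters:
            if cosine(vectors[w], vectors[c[0]]) >= threshold:
                c.append(w)
                break
        else:
            clusters.append([w])
    return clusters
```

In a sketch like this, keywords could be read off by taking the highest-weighted word from each cluster, so that each candidate topic contributes at least one keyword. The `vectors` argument is assumed to be a plain dictionary mapping each word to a pre-trained embedding.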

Keywords:
Keyword extraction; Computer science; Cluster analysis; Word embedding; Word (group theory); Cosine similarity; Artificial intelligence; Semantic similarity; Pointwise; Pointwise mutual information; Natural language processing; Graph; Word lists by frequency; Embedding; Data mining; Mutual information; Theoretical computer science; Mathematics

Metrics

Cited By: 6
FWCI (Field Weighted Citation Impact): 0.14
Refs: 26
Citation Normalized Percentile: 0.47
Is in top 1%
Is in top 10%

Topics

Advanced Text Analysis Techniques
Physical Sciences → Computer Science → Artificial Intelligence
Information Retrieval and Search Behavior
Physical Sciences → Computer Science → Information Systems
Text and Document Classification Technologies
Physical Sciences → Computer Science → Artificial Intelligence