JOURNAL ARTICLE

Sprinkling Topics for Weakly Supervised Text Classification

Abstract

Supervised text classification algorithms require a large number of human-labeled documents, and producing those labels is a labor-intensive and time-consuming process. In this paper, we propose a weakly supervised algorithm in which supervision comes in the form of labels assigned to Latent Dirichlet Allocation (LDA) topics. We then use this weak supervision to "sprinkle" artificial words into the training documents, so that the topics identified are in accordance with the underlying class structure of the corpus, based on higher-order word associations. We evaluate this approach for improving text classification performance on three real-world datasets.
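The sprinkling step can be illustrated with a minimal sketch. The abstract does not say how the artificial words are chosen or how many are added, so everything here is an assumption made for illustration: the function name `sprinkle`, the `copies` parameter, the `__class_…__` token format, and the rule of using each document's dominant topic. The topic proportions are assumed to come from an LDA model fitted beforehand, and the topic labels from a human annotator.

```python
from typing import Dict, List, Sequence


def sprinkle(
    docs: Sequence[List[str]],
    doc_topic: Sequence[Sequence[float]],
    topic_label: Dict[int, str],
    copies: int = 1,
) -> List[List[str]]:
    """Append an artificial class word to each tokenized document.

    docs        : tokenized documents
    doc_topic   : per-document topic proportions (assumed output of a fitted LDA)
    topic_label : human-assigned class label for each topic index
    copies      : how many times to repeat the artificial word (hypothetical knob)
    """
    sprinkled = []
    for tokens, proportions in zip(docs, doc_topic):
        # Assumption: use the document's dominant topic to pick the label.
        top = max(range(len(proportions)), key=lambda k: proportions[k])
        # Build a token that is unlikely to collide with real vocabulary.
        artificial = f"__class_{topic_label[top]}__"
        sprinkled.append(list(tokens) + [artificial] * copies)
    return sprinkled


# Toy data: two documents, two topics, one human label per topic.
docs = [["goal", "match", "team"], ["election", "vote", "party"]]
doc_topic = [[0.9, 0.1], [0.2, 0.8]]
topic_label = {0: "sports", 1: "politics"}
augmented = sprinkle(docs, doc_topic, topic_label)
```

Under these assumptions, a topic model trained on `augmented` would see the artificial words co-occurring with the real words of each class, which is one way the sprinkled words could pull topics toward the class structure.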

Keywords:
Computer science; Artificial intelligence; Natural language processing; Machine learning; Pattern recognition (psychology)

Metrics

Cited By: 14
FWCI (Field Weighted Citation Impact): 1.93
Refs: 22
Citation Normalized Percentile: 0.89

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
