JOURNAL ARTICLE

Weakly supervised text classification method based on transformer

Abstract

Seed word-driven methods are the dominant approach to weakly supervised text classification (WTC). Existing seed word-driven methods update the seed words using metrics such as Term Frequency (TF), Inverse Document Frequency (IDF), and their combinations. These methods assign the same weight to all metrics, which leads to common or poorly discriminating words being selected as seed words. In addition, most of the text classifiers used in previous studies have difficulty capturing correlations within the text and its global information. To address these problems, a Transformer is first used as the text classifier: its multi-head self-attention mechanism captures long-range dependencies while computing in parallel, and fully learns the global semantic information of the input text. An improved TF-IDF method is then proposed that increases the weight of IDF so that common words that harm classification can be filtered out. Experiments on the 20News and NYT datasets show improved results.
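
The abstract does not give the formula of the improved TF-IDF, so the following is only a minimal sketch of the general idea, assuming the extra IDF weight is expressed as an exponent. The function name `rank_seed_candidates`, the parameter `alpha`, and the toy documents are all hypothetical and are not taken from the paper. With `alpha = 1` the score is plain TF-IDF; with `alpha > 1` words that occur in many documents are penalized more strongly.

```python
import math
from collections import Counter


def rank_seed_candidates(class_docs, all_docs, alpha=2.0, top_k=3):
    """Rank candidate seed words for one class by TF * IDF**alpha.

    alpha is a hypothetical exponent (an assumption, not the paper's formula):
    alpha = 1 is plain TF-IDF, alpha > 1 gives IDF more weight.
    class_docs must be a subset of all_docs so every word has a document frequency.
    """
    n_docs = len(all_docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in all_docs for word in set(doc))
    # Term frequency over the documents currently assigned to the class.
    tf = Counter(word for doc in class_docs for word in doc)
    total = sum(tf.values())
    scores = {}
    for word, count in tf.items():
        idf = math.log(n_docs / df[word])
        scores[word] = (count / total) * idf ** alpha
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy corpus: "said" is frequent in the sports class but also common elsewhere.
sports_docs = [
    ["said"] * 5 + ["team", "game"],
    ["said"] * 5 + ["team", "coach"],
]
other_docs = [["said", "market"], ["said", "stocks"], ["election", "vote"]]
all_docs = sports_docs + other_docs

plain = rank_seed_candidates(sports_docs, all_docs, alpha=1.0, top_k=1)
boosted = rank_seed_candidates(sports_docs, all_docs, alpha=2.0, top_k=3)
print(plain, boosted)
```

On this toy corpus the common word "said" ranks first under plain TF-IDF but drops out of the top three once IDF is weighted more heavily, which is the filtering effect the abstract describes. A larger exponent also favors very rare words, so in practice the weight would need tuning.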

Keywords:
Computer science; tf–idf; Artificial intelligence; Transformer; Classifier (UML); Natural language processing; Machine learning; Term (time)
