JOURNAL ARTICLE

Learning Label-Adaptive Representation for Large-Scale Multi-Label Text Classification

Peng Cheng, Haobo Wang, Jue Wang, Lidan Shou, Ke Chen, Gang Chen, Chang Yao

Year: 2024
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Vol: 32
Pages: 2630-2640
Publisher: Institute of Electrical and Electronics Engineers

Abstract

Large-scale multi-label text classification (LMTC) aims to tag each text with multiple relevant labels from a large label space, which typically exhibits high sparsity, diversity, and skewness. To learn text representations in LMTC, a straightforward strategy is to learn a single vector to represent the whole text, yet this limits generalization to diverse labels; another popular strategy is to learn a specific representation per label via attention weighting, but excessively emphasizing tail labels restricts the overall performance. To cope with these limitations, we propose a novel LMTC framework, dubbed LADAR, which learns label-adaptive text representations to ensure high performance across large label spaces. Specifically, we construct a representation pool for each text by collecting multi-layer features of the deep model as well as multi-granularity features of the text. Furthermore, all labels are adaptively matched to their most relevant representations to predict the final scores. Experiments on five benchmark datasets demonstrate that LADAR achieves superior results to state-of-the-art LMTC approaches. In particular, LADAR achieves significantly better performance on tail labels, e.g., a 5.09% relative improvement in PSP@5 on the Amazon-670K dataset over the best baseline.
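
The abstract reports tail-label gains in PSP@5 (propensity-scored precision at 5) as a relative improvement. The following is a minimal sketch of how such a metric is commonly computed for a single text, not the authors' evaluation code. It assumes the unnormalized form of the metric and takes per-label propensities as given; the function names, toy labels, and propensity values are hypothetical and chosen only for illustration.

```python
def psp_at_k(scores, relevant, propensity, k=5):
    """Unnormalized propensity-scored precision at k for one text.

    scores:     dict mapping label -> predicted score
    relevant:   set of ground-truth labels for the text
    propensity: dict mapping label -> propensity in (0, 1]; rare (tail)
                labels have small values, so predicting them correctly
                earns a larger reward than predicting a head label
    """
    # Rank labels by predicted score and keep the k highest.
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    # Each correct label contributes the inverse of its propensity.
    return sum(1.0 / propensity[l] for l in top_k if l in relevant) / k


def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Toy data (hypothetical labels and propensities, not from the paper).
scores = {"head": 0.9, "mid": 0.7, "tail": 0.6, "other": 0.1}
relevant = {"head", "tail"}
propensity = {"head": 1.0, "mid": 0.5, "tail": 0.25, "other": 0.5}

psp_2 = psp_at_k(scores, relevant, propensity, k=2)  # only "head" is hit
psp_3 = psp_at_k(scores, relevant, propensity, k=3)  # "head" and "tail" are hit
```

In this toy case the correctly predicted tail label contributes four times as much as the head label, which is why the metric is used to assess tail-label performance. Published results often normalize the score by the best attainable value per text; that step is omitted here.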

Keywords:
Computer science, Artificial intelligence, Granularity, Construct (python library), Representation (politics), Benchmark (surveying), Weighting, Generalization, Pattern recognition (psychology), Machine learning, Scale (ratio), Mathematics

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 1.92
Refs: 33
Citation Normalized Percentile: 0.81

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Text Analysis Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence