Hui Huang, Mingfeng Yu, Shuai Yu, Yongbin Qin, Chuan Lin
Abstract

Multi-label text classification aims to assign each document a subset of relevant labels from a predefined set, addressing the realistic scenario in which a text can belong to multiple categories or topics. The task is challenging because it requires capturing both rich semantic features and complex inter-label dependencies. Recent advances have incorporated label semantic information into text classification and improved performance, but two key issues remain: (i) label embeddings are often shallow and fail to capture fine-grained semantic relationships between labels and text; (ii) the implicit similarity between texts indicated by label co-occurrence is underutilized, making it difficult for models to capture higher-order label dependencies. In this paper, we propose a novel multi-label text classification model that addresses these issues via a dual-branch attention network enhanced with supervised contrastive learning. The model consists of a label attention branch that learns label-specific document representations by attending to label semantics, and a self-attention branch that captures global contextual features of the text; the two representations are fused into a comprehensive document representation. To exploit label co-occurrence patterns, we introduce a label-guided contrastive learning objective that treats documents with overlapping labels as positive pairs and all other documents as negatives, pulling semantically related documents closer in the representation space and pushing unrelated ones apart. We conduct extensive experiments on four benchmark datasets—AAPD, RCV1, EUR-Lex, and Reuters-21578—and the results show that our approach attains competitive performance relative to state-of-the-art systems such as XML-CNN, EXAM, AttentionXML, LSAN, GATTN, and LDGN. Complementary ablation studies and hyper-parameter analyses corroborate the contribution of each module and clarify the influence of the contrastive-loss weight and temperature.

Collectively, these findings highlight the effectiveness of coupling label-semantic enrichment with instance-level contrastive supervision for multi-label text classification, especially in settings characterized by large label spaces and pronounced label imbalance.
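The label-guided contrastive objective described above can be sketched as follows. This is an illustrative implementation, not the authors' code: it assumes documents whose multi-hot label vectors share at least one label are positives, and it uses a standard supervised-contrastive (InfoNCE-style) formulation; the function name, batch shapes, and temperature value are all assumptions.

```python
import numpy as np

def label_guided_contrastive_loss(features, labels, temperature=0.1):
    """Hypothetical sketch of a label-guided supervised contrastive loss.

    features: (B, d) array of document embeddings.
    labels:   (B, L) multi-hot label matrix; documents sharing any label
              are treated as positive pairs, all others as negatives.
    """
    # Normalize embeddings so dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature                    # (B, B) similarity logits
    B = len(z)

    # Positive mask: pairs with overlapping label sets, excluding self-pairs.
    pos = ((labels @ labels.T) > 0).astype(float)
    np.fill_diagonal(pos, 0.0)

    # Row-wise log-softmax over all other documents (diagonal excluded).
    logits_mask = 1.0 - np.eye(B)
    exp_sim = np.exp(sim) * logits_mask
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))

    # Average log-probability over each anchor's positives; anchors with
    # no positive pair in the batch are skipped.
    pos_counts = pos.sum(axis=1)
    valid = pos_counts > 0
    loss = -(pos * log_prob).sum(axis=1)[valid] / pos_counts[valid]
    return loss.mean()
```

Minimizing this loss increases the similarity of documents with overlapping labels relative to unrelated documents in the same batch, which is one concrete way to realize the "pull related documents closer, push unrelated ones apart" behavior the abstract describes.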