JOURNAL ARTICLE

Self-Paced Pairwise Representation Learning for Semi-Supervised Text Classification

Abstract

Text classification is a vital tool for web content mining. Semi-supervised text classification (SSTC) reduces annotation costs by training on a few labeled texts alongside many unlabeled texts. Two challenges in SSTC remain unsolved: overfitting caused by the limited labeled data, and mislabeling of unlabeled texts. To address these issues, this paper proposes a Self-Paced PairWise representation learning (SPPW) model. Concretely, SPPW alleviates overfitting by replacing the overfitting-prone learning of a parameterized classifier with pairwise representation learning. In addition, we propose a novel self-paced text filtering method that integrates label confidence and text hardness to jointly reduce the number of mislabeled texts. Extensive experiments on three benchmark SSTC datasets show that SPPW outperforms the baselines and effectively mitigates both the overfitting and mislabeling problems.
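
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the SPPW implementation: the abstract gives no formulas, so every name here (`cosine`, `pairwise_predict`, `self_paced_filter`), the mean-cosine scoring rule, and the way confidence and hardness are thresholded are assumptions made purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pairwise_predict(query, labeled):
    """Predict a label by comparing `query` with labeled representations.

    Assumption: each class is scored by the mean cosine similarity between
    the query and the labeled vectors of that class, so no parameterized
    classifier head is trained.
    labeled: list of (vector, label) pairs.
    """
    sims = {}
    for vec, label in labeled:
        sims.setdefault(label, []).append(cosine(query, vec))
    scores = {label: sum(s) / len(s) for label, s in sims.items()}
    best = max(scores, key=scores.get)
    return best, scores


def self_paced_filter(candidates, conf_threshold, hardness_budget):
    """Keep pseudo-labeled texts that are confident AND easy enough.

    Assumption: confidence and hardness are precomputed scores in [0, 1];
    the hardness budget is meant to grow over training rounds so that
    harder texts are admitted later (the self-paced part).
    candidates: list of (text_id, confidence, hardness) triples.
    """
    return [
        tid
        for tid, conf, hard in candidates
        if conf >= conf_threshold and hard <= hardness_budget
    ]


# Hypothetical toy data: 2-d "representations" and made-up scores.
labeled = [([1.0, 0.0], "sports"), ([0.9, 0.1], "sports"), ([0.0, 1.0], "tech")]
label, _ = pairwise_predict([0.8, 0.2], labeled)

candidates = [("a", 0.95, 0.1), ("b", 0.95, 0.6), ("c", 0.55, 0.1)]
early_round = self_paced_filter(candidates, 0.9, 0.3)
later_round = self_paced_filter(candidates, 0.9, 0.7)
```

In this toy setup, raising the hardness budget between rounds admits the confident but harder text `"b"`, while the low-confidence text `"c"` stays excluded in both rounds.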

Keywords:
Overfitting; Pairwise comparison; Artificial intelligence; Computer science; Machine learning; Classifier (UML); Annotation; Representation (politics); Parameterized complexity; Feature learning; Semi-supervised learning; Benchmark (surveying); Pattern recognition (psychology); Artificial neural network

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 22
Citation Normalized Percentile: 0.05

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Sentiment Analysis and Opinion Mining
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Adaptive Graph Learning for Semi-supervised Self-paced Classification

Long Chen, Jianbo Lu

Journal: Neural Processing Letters, Year: 2021, Vol: 54 (4), Pages: 2695-2716

JOURNAL ARTICLE

Structure regularized self-paced learning for robust semi-supervised pattern classification

Nannan Gu, Pengying Fan, Mingyu Fan, Di Wang

Journal: Neural Computing and Applications, Year: 2018, Vol: 31 (10), Pages: 6559-6574

JOURNAL ARTICLE

Self-Supervised Contrastive Representation Learning for Semi-Supervised Time-Series Classification

Emadeldeen Eldele, Mohamed Ragab, Zhenghua Chen, Min Wu, Chee Keong Kwoh, Xiaoli Li, Cuntai Guan

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, Year: 2023, Vol: 45 (12), Pages: 15604-15618