JOURNAL ARTICLE

Pre-trained Language Model Based Active Learning for Sentence Matching

Abstract

Active learning can significantly reduce the annotation cost of data-driven techniques. However, previous active learning approaches for natural language processing mainly depend on the entropy-based uncertainty criterion and ignore the characteristics of natural language. In this paper, we propose a pre-trained language model based active learning approach for sentence matching. Unlike previous active learning approaches, it derives linguistic criteria from the pre-trained language model to measure instances and help select more effective instances for annotation. Experiments demonstrate that our approach can achieve greater accuracy with fewer labeled training instances.
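
For illustration only, the entropy-based uncertainty criterion that the abstract attributes to earlier work can be sketched as below. This is not the method proposed in the paper; the function names and the example probabilities are hypothetical, and the sketch assumes a classifier that outputs a class distribution for each unlabeled sentence pair.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Return indices of the k pool instances with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical model outputs (match / no match) for four unlabeled sentence pairs.
pool_probs = [[0.98, 0.02], [0.55, 0.45], [0.80, 0.20], [0.50, 0.50]]
selected = select_most_uncertain(pool_probs, k=2)
```

The selected instances are the ones the model is least sure about; they would be sent to annotators before the model is retrained. A criterion of this kind looks only at the output distribution, which is the limitation the abstract points to.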

Keywords:
Computer science; Artificial intelligence; Sentence; Annotation; Natural language processing; Matching (statistics); Active learning (machine learning); Natural language; Language model; Entropy (arrow of time); Machine learning; Mathematics

Metrics

Cited By: 8
FWCI (Field Weighted Citation Impact): 0.88
Refs: 28
Citation Normalized Percentile: 0.79

Topics

Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Machine Learning and Algorithms (Physical Sciences → Computer Science → Artificial Intelligence)

Related Documents

JOURNAL ARTICLE

Using Pre-trained Language Model to Enhance Active Learning for Sentence Matching

Guirong Bai, Shizhu He, Kang Liu, Jun Zhao

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing Year: 2021 Vol: 21 (2) Pages: 1-19

JOURNAL ARTICLE

Schema matching based on energy domain pre-trained language model

Zhiyu Pan, Muchen Yang, Antonello Monti

Journal: Energy Informatics Year: 2023 Vol: 6 (S1)

JOURNAL ARTICLE

Talent Supply and Demand Matching Based on Prompt Learning and the Pre-Trained Language Model

Kuo Li, Jianhua Liu, Cunbo Zhuang

Journal: Applied Sciences Year: 2025 Vol: 15 (5) Pages: 2536-2536

JOURNAL ARTICLE

A pre-trained data deduplication model based on active learning

Haochen Shi, Xinyao Liu, Fengmao Lv, Hongtao Xue, Jie Hu, Shengdong Du, Tianrui Li

Journal: Expert Systems with Applications Year: 2025 Vol: 292 Pages: 128628-128628

JOURNAL ARTICLE

Active Learning on Pre-trained Language Model with Task-Independent Triplet Loss

Seungmin Seo, Donghyun Kim, Youbin Ahn, Kyong-Ho Lee

Journal: Proceedings of the AAAI Conference on Artificial Intelligence Year: 2022 Vol: 36 (10) Pages: 11276-11284