Pre-trained Language Model Based Active Learning for Sentence Matching

Guirong Bai; Shizhu He; Kang Liu; Jun Zhao; Zaiqing Nie

doi:10.18653/v1/2020.coling-main.130

JOURNAL ARTICLE

Year: 2020 Pages: 1495-1504

Abstract

Active learning can significantly reduce the annotation cost of data-driven techniques. However, previous active learning approaches for natural language processing depend mainly on the entropy-based uncertainty criterion and ignore the characteristics of natural language. In this paper, we propose a pre-trained language model based active learning approach for sentence matching. Unlike previous active learning approaches, it draws linguistic criteria from the pre-trained language model to measure instances, which helps select more effective instances for annotation. Experiments demonstrate that our approach can achieve greater accuracy with fewer labeled training instances.
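
For context, the entropy-based uncertainty criterion that the abstract names as the usual baseline can be sketched roughly as follows. This is a minimal illustration of that baseline only, not the approach proposed in the paper; the function names and the probability values are hypothetical and chosen for the example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs, k):
    """Return indices of the k unlabeled instances with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical match / no-match probabilities for four unlabeled sentence pairs.
pool_probs = [[0.98, 0.02], [0.55, 0.45], [0.80, 0.20], [0.50, 0.50]]

# The two pairs the current model is least sure about would be sent for annotation.
picked = select_most_uncertain(pool_probs, k=2)
```

Under this criterion, instances are ranked purely by how uncertain the model's prediction is, which is the limitation the abstract points to: nothing about the language of the sentences themselves enters the score.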

Keywords:

Computer science; Artificial intelligence; Sentence; Annotation; Natural language processing; Matching (statistics); Active learning (machine learning); Natural language; Language model; Entropy (arrow of time); Machine learning; Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 0.88

Citation Normalized Percentile: 0.79

Topics

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Machine Learning and Algorithms

Physical Sciences → Computer Science → Artificial Intelligence
