Reinforced Active Learning for Low-Resource, Domain-Specific, Multi-Label Text Classification

Lukas Wertz; Jasmina Bogojeska; Katsiaryna Mirylenka; Jonas Kuhn

doi:10.18653/v1/2023.findings-acl.697

JOURNAL ARTICLE

Year: 2023

Abstract

Text classification datasets from specialised or technical domains are in high demand, especially in industrial applications. However, due to the high cost of annotation, such datasets are usually expensive to create. While Active Learning (AL) can reduce the labeling cost, the required AL strategies are often tested only on general knowledge domains and tend to use information sources that are not consistent across tasks. We propose Reinforced Active Learning (RAL) to train a Reinforcement Learning policy that utilizes many different aspects of the data and the task in order to select the most informative unlabeled subset dynamically over the course of the AL procedure. We demonstrate the superior performance of the proposed RAL framework compared to strong AL baselines across four intricate multi-class, multi-label text classification datasets taken from specialised domains. In addition, we experiment with a unique data augmentation approach to further reduce the number of samples RAL needs to annotate.
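
The abstract does not spell out how the policy scores or selects samples, so the following is only a minimal sketch of the general idea, not the authors' RAL method. It assumes a mean per-label entropy score as one possible "informativeness" signal for multi-label predictions, and it swaps the learned Reinforcement Learning policy for a much simpler epsilon-greedy bandit that chooses between acquisition strategies. All names (`multilabel_uncertainty`, `select_most_uncertain`, `EpsilonGreedyPolicy`) and the reward definition are hypothetical.

```python
from __future__ import annotations

import math
import random


def binary_entropy(p: float) -> float:
    """Entropy (in bits) of a single yes/no label prediction."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def multilabel_uncertainty(label_probs: list[float]) -> float:
    """Mean per-label entropy; an assumed, simple informativeness score."""
    return sum(binary_entropy(p) for p in label_probs) / len(label_probs)


def select_most_uncertain(pool: dict[str, list[float]], k: int) -> list[str]:
    """Return the ids of the k unlabeled items with the highest uncertainty."""
    ranked = sorted(pool, key=lambda i: multilabel_uncertainty(pool[i]), reverse=True)
    return ranked[:k]


def select_random(pool: dict[str, list[float]], k: int, rng: random.Random) -> list[str]:
    """Baseline strategy: pick k unlabeled items uniformly at random."""
    return rng.sample(sorted(pool), k)


class EpsilonGreedyPolicy:
    """Toy stand-in for a learned policy: picks one strategy per AL round."""

    def __init__(self, strategies: list[str], epsilon: float = 0.1, seed: int = 0):
        self.strategies = list(strategies)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {s: 0.0 for s in self.strategies}
        self.count = {s: 0 for s in self.strategies}

    def choose(self) -> str:
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.strategies)
        return max(self.strategies, key=lambda s: self.value[s])

    def update(self, strategy: str, reward: float) -> None:
        # Running mean of observed rewards, e.g. the gain in a validation
        # score after annotating the selected batch (an assumption here).
        self.count[strategy] += 1
        self.value[strategy] += (reward - self.value[strategy]) / self.count[strategy]


# Hypothetical per-label probabilities predicted for three unlabeled items.
pool = {"a": [0.5, 0.5], "b": [0.99, 0.01], "c": [0.6, 0.9]}
print(select_most_uncertain(pool, 2))  # ['a', 'c']
```

In a full loop, each round would call `choose()`, run the chosen strategy on the current pool, send the selected items for annotation, retrain the classifier, and pass the resulting change in validation performance to `update()`. A policy of the kind the abstract describes would condition on many more aspects of the data and the task than this single running average.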

Keywords:

Computer science; Reinforcement learning; Active learning (machine learning); Task (project management); Annotation; Domain (mathematical analysis); Class (philosophy); Machine learning; Artificial intelligence; Labeled data; Multi-label classification; Resource (disambiguation)

Metrics

FWCI (Field Weighted Citation Impact): 1.28

Citation Normalized Percentile: 0.79

Topics

Machine Learning and Algorithms

Physical Sciences → Computer Science → Artificial Intelligence

Text and Document Classification Technologies

Physical Sciences → Computer Science → Artificial Intelligence

Oil and Gas Production Techniques

Physical Sciences → Engineering → Ocean Engineering

Related Documents

Multi-domain active learning for text classification

Effective multi-label active learning for text classification

Active Learning Strategies for Multi-Label Text Classification

Deep active learning for multi label text classification