Self-supervised pre-training has achieved remarkable success on a wide range of natural language processing tasks. Masked language modeling (MLM) has been widely used for pre-training effective bidirectional representations, but it comes at a substantial training cost. In this paper, we propose a novel concept-based curriculum masking (CCM) method to efficiently pre-train a language model. CCM differs from existing curriculum learning approaches in two key ways that reflect the nature of MLM. First, we introduce a novel curriculum that evaluates the MLM difficulty of each token with a carefully designed linguistic difficulty criterion. Second, we construct a curriculum that masks easy words and phrases first and, guided by a knowledge graph, gradually extends masking to words and phrases related to those already masked. Experimental results show that CCM significantly improves pre-training efficiency. Specifically, the model trained with CCM achieves performance comparable to the original BERT on the General Language Understanding Evaluation benchmark at half the training cost.
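As a rough illustration only (not the authors' implementation), the sketch below shows one way such a knowledge-graph-driven curriculum could be realized: an assumed adjacency-list graph is expanded stage by stage from an initial set of easy concepts, and at each stage only tokens belonging to the currently maskable concepts are eligible for masking. The function names (`build_curriculum_stages`, `mask_tokens`), the toy graph, and the seeding by a fixed "easy" set are hypothetical simplifications of the difficulty criterion and phrase handling described in the paper.

```python
import random

def build_curriculum_stages(knowledge_graph, seed_concepts, num_stages):
    """Expand the maskable concept set stage by stage.

    knowledge_graph: dict mapping a concept to related concepts (hypothetical
    adjacency-list form of the paper's knowledge graph).
    seed_concepts: concepts judged "easy" by the difficulty criterion.
    Returns a list of concept sets, one per curriculum stage.
    """
    maskable = set(seed_concepts)
    stages = [set(maskable)]
    for _ in range(num_stages - 1):
        # Add concepts adjacent (in the knowledge graph) to anything already
        # maskable, so masking gradually spreads to related concepts.
        frontier = {nbr for c in maskable for nbr in knowledge_graph.get(c, ())}
        maskable |= frontier
        stages.append(set(maskable))
    return stages

def mask_tokens(tokens, maskable_concepts, mask_token="[MASK]", mask_prob=0.15):
    """Mask only tokens that belong to concepts allowed at the current stage."""
    return [
        mask_token if tok in maskable_concepts and random.random() < mask_prob else tok
        for tok in tokens
    ]

# Toy usage with a made-up graph and an easy-first seed set.
kg = {"dog": ["puppy", "mammal"], "mammal": ["vertebrate"]}
stages = build_curriculum_stages(kg, seed_concepts=["dog"], num_stages=3)
print(stages)  # masking spreads from {'dog'} outward over the stages
print(mask_tokens("the dog chased a puppy".split(), stages[1]))
```

In this toy setting, early stages mask only the seed concepts, and later stages admit their knowledge-graph neighbors, mirroring the easy-to-related progression the abstract describes.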