Neural CRF Model for Sentence Alignment in Text Simplification

Chao Jiang; Mounica Maddela; Wuwei Lan; Zhong Yang; Wei Xu

doi:10.18653/v1/2020.acl-main.709

JOURNAL ARTICLE

Year: 2020 Pages: 7943-7960

Abstract

The success of a text simplification system depends heavily on the quality and quantity of complex-simple sentence pairs in the training corpus, which are extracted by aligning sentences between parallel articles. To evaluate and improve sentence alignment quality, we create two manually annotated sentence-aligned datasets from two commonly used text simplification corpora, Newsela and Wikipedia. We propose a novel neural CRF alignment model that not only leverages the sequential nature of sentences in parallel documents but also uses a neural sentence pair model to capture semantic similarity. Experiments demonstrate that our proposed approach outperforms all previous work on the monolingual sentence alignment task by more than 5 points in F1. We apply our CRF aligner to construct two new text simplification datasets, Newsela-Auto and Wiki-Auto, which are much larger and of better quality than the existing datasets. A Transformer-based seq2seq model trained on our datasets establishes a new state of the art for text simplification in both automatic and human evaluation.
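The idea of combining sentence-pair similarity with the sequential order of sentences can be illustrated with a small decoding sketch. This is not the authors' implementation: in the paper the similarity scores come from a neural sentence pair model and the CRF parameters are learned, whereas here the similarity matrix is given as plain numbers and the function name `viterbi_align`, the `null_score` for leaving a sentence unaligned, and the distance-based `jump_penalty` are all assumptions made for illustration.

```python
from typing import List, Optional


def viterbi_align(
    sim: List[List[float]],
    null_score: float = 0.5,     # assumed score for leaving a sentence unaligned
    jump_penalty: float = 0.1,   # assumed cost per position of distance between alignments
) -> List[Optional[int]]:
    """Align each simple sentence to one complex sentence index, or None.

    sim[i][j] is a similarity score between simple sentence i and
    complex sentence j (hypothetical values, not a trained model).
    """
    n = len(sim)
    if n == 0:
        return []
    m = len(sim[0])
    labels = m + 1  # label m stands for "unaligned"

    def emit(i: int, a: int) -> float:
        return null_score if a == m else sim[i][a]

    def trans(prev: int, cur: int) -> float:
        # No transition cost when either neighbour is unaligned.
        if prev == m or cur == m:
            return 0.0
        return -jump_penalty * abs(cur - prev)

    # best[i][a]: score of the best label sequence ending with label a at sentence i
    best = [[emit(0, a) for a in range(labels)]]
    back: List[List[int]] = []
    for i in range(1, n):
        row, ptr = [], []
        for cur in range(labels):
            p = max(range(labels), key=lambda q: best[-1][q] + trans(q, cur))
            row.append(best[-1][p] + trans(p, cur) + emit(i, cur))
            ptr.append(p)
        best.append(row)
        back.append(ptr)

    # Follow back-pointers from the best final label.
    a = max(range(labels), key=lambda x: best[-1][x])
    path = [a]
    for ptr in reversed(back):
        a = ptr[a]
        path.append(a)
    path.reverse()
    return [None if a == m else a for a in path]
```

With a nonzero `jump_penalty`, a slightly less similar but nearby sentence can win over a distant one, which is the sequential effect the abstract refers to; setting `jump_penalty=0` reduces the sketch to an independent choice per sentence.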

Keywords:

Computer science; Sentence; Natural language processing; Transformer; Artificial intelligence; Text simplification; Task (project management); Construct (python library)

Metrics

Cited By: 100

FWCI (Field Weighted Citation Impact): 12.19

Refs

Citation Normalized Percentile: 0.99

Is in top 1%

Is in top 10%

Topics

Text Readability and Simplification

Physical Sciences → Computer Science → Artificial Intelligence

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Sentence Alignment Methods for Improving Text Simplification Systems

Text-Based English-Arabic Sentence Alignment

Improving Human Text Simplification with Sentence Fusion
