Abstract

We present several methods for sentence alignment of texts written at different complexity levels. Using the best of them, we sentence-align the Newsela corpora, thereby providing a large training resource for automatic text simplification (ATS) systems. We show that, with this dataset, even standard phrase-based statistical machine translation models for ATS can outperform state-of-the-art ATS systems.

Keywords:
Sentence; Computer science; Volume (thermodynamics); Natural language processing; Artificial intelligence; Computational linguistics; Linguistics; Philosophy; Physics

Metrics

Cited By: 23
FWCI (Field Weighted Citation Impact): 2.06
Refs: 28
Citation Normalized Percentile: 0.89

Topics

Natural Language Processing Techniques: Physical Sciences → Computer Science → Artificial Intelligence
Text Readability and Simplification: Physical Sciences → Computer Science → Artificial Intelligence
Topic Modeling: Physical Sciences → Computer Science → Artificial Intelligence