We provide several methods for sentence alignment of texts with different complexity levels. Using the best of them, we sentence-align the Newsela corpora, thus providing large training materials for automatic text simplification (ATS) systems. We show that using this dataset, even the standard phrase-based statistical machine translation models for ATS can outperform the state-of-the-art ATS systems.
Max Schwarzer, Teerapaun Tanprasert, David Kauchak
Chao Jiang, Mounica Maddela, Wuwei Lan, Zhong Yang, Wei Xu
Rafaella Vale, Rafael Dueire Lins, Rafael Ferreira