BOOK-CHAPTER

Improved Statistical Machine Translation Using Monolingual Paraphrases

Abstract

We propose a novel monolingual sentence paraphrasing method that augments the training data for statistical machine translation systems “for free”: the additional data is derived from data that is already available, so no further aligned data has to be created. Starting with a syntactic tree, we recursively generate new sentence variants in which noun compounds are paraphrased using suitable prepositions and, vice versa, preposition-containing noun phrases are turned into noun compounds. The evaluation shows an improvement equivalent to 33%–50% of the improvement obtained by doubling the amount of training data.
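The recursive generation step described in the abstract might be sketched roughly as follows. This is a toy illustration, not the chapter's implementation: the tuple-based tree encoding, the `PREPOSITIONS` list, and all function names are assumptions made for the example, and it ignores inflection and the selection of a suitable preposition for each compound.

```python
from itertools import product

# Hypothetical preposition inventory; choosing *suitable* prepositions
# for a given compound is not modelled in this sketch.
PREPOSITIONS = ("of", "for")


def is_leaf(node):
    """A leaf is a (tag, word) pair; a phrase is (label, child, ...)."""
    return len(node) == 2 and isinstance(node[1], str)


def is_noun(node):
    return is_leaf(node) and node[0].startswith("NN")


def is_prep_phrase(node):
    """Match the simple pattern PP -> IN NN."""
    return (
        not is_leaf(node)
        and node[0] == "PP"
        and len(node) == 3
        and is_leaf(node[1])
        and node[1][0] == "IN"
        and is_noun(node[2])
    )


def variants(node):
    """Return every word sequence (as a tuple) this subtree can yield."""
    if is_leaf(node):
        return {(node[1],)}
    label, children = node[0], node[1:]
    # Recurse first, then combine one variant per child, in order.
    result = {
        sum(combo, ())
        for combo in product(*(variants(child) for child in children))
    }
    if label == "NP" and len(children) == 2:
        first, second = children
        if is_noun(first) and is_noun(second):
            # Noun compound "modifier head" -> "head PREP modifier".
            for prep in PREPOSITIONS:
                result.add((second[1], prep, first[1]))
        elif is_noun(first) and is_prep_phrase(second):
            # "head PREP modifier" -> noun compound "modifier head".
            result.add((second[2][1], first[1]))
    return result


def paraphrase(tree):
    """Return all sentence variants of the tree as sorted strings."""
    return sorted(" ".join(ws) for ws in variants(tree))


# Toy parse of "water quality affects health of fish".
TREE = (
    "S",
    ("NP", ("NN", "water"), ("NN", "quality")),
    ("VP",
     ("VBZ", "affects"),
     ("NP", ("NN", "health"), ("PP", ("IN", "of"), ("NN", "fish")))),
)

if __name__ == "__main__":
    for sentence in paraphrase(TREE):
        print(sentence)
```

In an augmentation setting, each generated source-side variant would presumably be paired with the unchanged target-side sentence; that pairing step is likewise an assumption and is not shown here.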

Keywords:
Translation (biology); Machine translation; Computer science; Natural language processing; Artificial intelligence; Linguistics; Chemistry; Philosophy

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.00
Refs: 21
Citation Normalized Percentile: 0.26

Topics

Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Text Readability and Simplification (Physical Sciences → Computer Science → Artificial Intelligence)