Named entities are often rare words, and transliterating them across languages remains a challenging task. In this paper, we study a novel technique that segments a named entity into a sequence of sub-words or characters. We propose to learn the transliteration mechanism with a sequence-to-sequence neural network. Applying the proposed technique to personal name transliteration on an LDC dataset, we obtain an improvement of more than 10 BLEU points over the competing statistical method on the same corpus.
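To make the segmentation step concrete, the following is a minimal sketch of how a name could be split into characters or sub-words before being passed to a sequence-to-sequence model. It is an illustration under stated assumptions, not the procedure used in the paper: the function names, the `_` word-boundary marker, and the greedy longest-match strategy over a given sub-word vocabulary are all hypothetical choices.

```python
def segment_chars(name: str) -> list[str]:
    """Split a name into characters, inserting '_' between words.

    The '_' boundary marker is an assumption made for this sketch.
    """
    tokens: list[str] = []
    for word in name.split():
        tokens.extend(word)   # one token per character
        tokens.append("_")    # mark the end of the word
    return tokens[:-1] if tokens else tokens  # drop the trailing marker


def segment_subwords(name: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match segmentation over a hypothetical vocabulary.

    Any span not covered by the vocabulary falls back to single characters,
    so every name can be segmented.
    """
    tokens: list[str] = []
    for word in name.split():
        i = 0
        while i < len(word):
            # Try the longest candidate first; a single character always matches.
            for j in range(len(word), i, -1):
                if word[i:j] in vocab or j == i + 1:
                    tokens.append(word[i:j])
                    i = j
                    break
    return tokens


if __name__ == "__main__":
    print(segment_chars("John Smith"))
    print(segment_subwords("Johnson", {"John", "son"}))
```

Character-level segmentation keeps the vocabulary small and avoids out-of-vocabulary tokens entirely, while sub-word segmentation yields shorter sequences at the cost of needing a vocabulary learned from data.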