Van-Hai Vu, Quang-Phuoc Nguyen, Kiem-Hieu Nguyen, Joon-Choul Shin, Cheol-Young Ock
Since the introduction of deep learning, a series of advances has been reported in automatic machine translation (MT). However, Korean-Vietnamese MT systems still face many challenges: a lack of data, words with multiple meanings, and grammatical diversity that depends on context. As a result, the quality of Korean-Vietnamese MT remains sub-optimal. This paper discusses a method for applying Named Entity Recognition (NER) and Part-of-Speech (POS) tagging to Vietnamese sentences to improve the performance of Korean-Vietnamese MT systems. For the implementation, we used a tool to annotate Vietnamese sentences with NER and POS tags. In addition, we used a Korean-Vietnamese parallel corpus of more than 450K sentence pairs from our previous research. The experimental results indicate that tagging NER and POS in Vietnamese sentences can improve the quality of Korean-Vietnamese Neural MT (NMT) in terms of Bilingual Evaluation Understudy (BLEU) and Translation Error Rate (TER) scores. On average, our MT system improved by 1.21 BLEU points and 2.33 TER points after applying both NER and POS tagging to the Vietnamese corpus. Owing to structural features of the two languages, MT systems in the Korean-to-Vietnamese direction consistently give better BLEU and TER results than systems in the reverse direction.
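To make the tagging step concrete, the following minimal sketch shows one way word-segmented Vietnamese tokens could be combined with their POS and NER tags before being passed to an NMT system. The abstract does not specify the annotation format, so the `word|POS|NER` layout, the function name `annotate_tokens`, and the example tags are all illustrative assumptions, not the authors' pipeline.

```python
from typing import List


def annotate_tokens(
    tokens: List[str],
    pos_tags: List[str],
    ner_tags: List[str],
    sep: str = "|",
) -> str:
    """Join each token with its POS and NER tag (hypothetical format).

    The "word|POS|NER" layout is an assumption made for illustration;
    the tagging tool and output format used in the paper may differ.
    """
    if not (len(tokens) == len(pos_tags) == len(ner_tags)):
        raise ValueError("tokens, pos_tags and ner_tags must have equal length")
    # Build one annotated token per word, then join them with spaces.
    return " ".join(sep.join(triple) for triple in zip(tokens, pos_tags, ner_tags))


# Illustrative input: a word-segmented sentence with made-up tag values.
tokens = ["Tôi", "sống", "ở", "Hà_Nội"]
pos_tags = ["P", "V", "E", "Np"]       # assumed POS labels
ner_tags = ["O", "O", "O", "B-LOC"]    # assumed NER labels (BIO scheme)

print(annotate_tokens(tokens, pos_tags, ner_tags))
```

Keeping the tags attached to each token keeps the sentence length unchanged, which is one reason this kind of layout is convenient for a parallel corpus; other designs, such as separate feature streams, would be equally plausible.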