Character Decomposition for Japanese-Chinese Character-Level Neural Machine Translation

Jinyi Zhang; Tadahiro Matsumoto

doi:10.1109/ialp48816.2019.9037677

JOURNAL ARTICLE

Year: 2019; Pages: 35-40

Abstract

After years of development, Neural Machine Translation (NMT) has produced richer translation results than ever across various language pairs and has become a machine translation approach with great potential. An NMT model can only translate words or characters that appear in its training data, so one problem for NMT is the handling of low-frequency words and characters. In this paper, we propose a method that removes characters whose frequency of appearance is below a given minimum threshold by decomposing them into their components and/or pseudo-characters, using a Chinese character decomposition table that we constructed. Experiments on Japanese-to-Chinese and Chinese-to-Japanese NMT with the ASPEC-JC (Asian Scientific Paper Excerpt Corpus, Japanese-Chinese) corpus show that the BLEU scores, the training time, and the number of parameters vary with the given minimum threshold for decomposing characters.
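The thresholded decomposition described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The table `TOY_TABLE`, the function names `decompose_rare` and `vocab_size`, and the four-sentence corpus are hypothetical and chosen only for illustration. The sketch applies a single decomposition pass and does not model pseudo-characters.

```python
from collections import Counter

# Hypothetical toy decomposition table (NOT the table built by the authors):
# each character maps to a sequence of its components.
TOY_TABLE = {
    "好": ["女", "子"],
    "明": ["日", "月"],
    "林": ["木", "木"],
}


def decompose_rare(sentences, table, min_freq):
    """Replace each character seen fewer than min_freq times by its components.

    Characters that are rare but missing from the table are left unchanged.
    """
    counts = Counter(ch for sent in sentences for ch in sent)
    result = []
    for sent in sentences:
        tokens = []
        for ch in sent:
            if counts[ch] < min_freq and ch in table:
                tokens.extend(table[ch])  # single pass, no recursion
            else:
                tokens.append(ch)
        result.append(tokens)
    return result


def vocab_size(tokenized):
    """Number of distinct tokens in a tokenized corpus."""
    return len({tok for sent in tokenized for tok in sent})


corpus = ["日月明", "女子好", "日月女子", "木林"]
baseline = [list(sent) for sent in corpus]
decomposed = decompose_rare(corpus, TOY_TABLE, min_freq=2)

print(vocab_size(baseline), vocab_size(decomposed))
```

In this toy corpus the three characters that occur only once and have table entries are replaced by their components, so the character vocabulary shrinks while the sequences get longer. A smaller vocabulary generally means smaller embedding and output layers, which is one plausible way the threshold could influence the number of parameters and the training time reported in the abstract.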

Keywords:

Character (mathematics); Computer science; Machine translation; Artificial intelligence; Natural language processing; Kanji; Translation (biology); Decomposition; Chinese characters; Table (database); Speech recognition; Mathematics; Data mining

Metrics

FWCI (Field Weighted Citation Impact): 0.31

Citation Normalized Percentile: 0.68

Topics

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence

Handwritten Text Recognition Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Hybrid Attention for Chinese Character-Level Neural Machine Translation

Improving character-level Japanese-Chinese neural machine translation with radicals as an additional input feature

Fully Character-Level Neural Machine Translation without Explicit Segmentation

Compact and Robust Models for Japanese-English Character-level Machine Translation

Plan, Attend, Generate: Character-Level Neural Machine Translation with Planning