JOURNAL ARTICLE

Statistical machine translation using hierarchical phrase alignment

Taro Watanabe, Kenji Imamura, Eiichiro Sumita, Hiroshi G. Okuno

Year: 2007   Journal: Systems and Computers in Japan   Vol: 38 (6)   Pages: 70-79   Publisher: Wiley

Abstract

The following three problems are known to exist in statistical machine translation: (1) the modeling problem of prescribing translation relations, (2) the training problem of determining parameter settings from a corpus of translated text, and (3) the search problem of determining the output text (the translation) given a statistical model and an input text. In this paper we find alignments of translations using phrase‐based units in a hierarchical fashion, with the intention of solving the above‐mentioned modeling and training problems with such hierarchical phrase alignments. As a first method, we perform chunking on the corpus on the basis of these hierarchical alignments and create translation models using these chunks as translation units. As a second method, we convert the translation relations expressed in the hierarchical phrase alignments into correspondences in the translation model, and perform additional training after initializing the model parameters to values obtained from these relations. The results of experiments with Japanese‐to‐English translation show that both methods improve performance, with the second method being particularly effective, resulting in an increase in translation rate from 61.3% to 70.0%. © 2007 Wiley Periodicals, Inc. Syst Comp Jpn, 38(6): 70–79, 2007; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.20271
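
For orientation, the three problems named in the abstract can be read against the standard noisy-channel formulation of statistical machine translation. This is a sketch under an assumption: the abstract does not state which formulation the paper uses, and the symbols f (input sentence), e (output sentence), and a (alignment) are introduced here for illustration only.

```latex
% Assumed noisy-channel formulation; f, e, a are illustrative symbols
% not taken from the abstract.
\begin{align}
  \hat{e} &= \operatorname*{argmax}_{e} P(e \mid f)
           = \operatorname*{argmax}_{e} P(e)\, P(f \mid e), \\
  P(f \mid e) &= \sum_{a} P(f, a \mid e).
\end{align}
```

Under this reading, modeling corresponds to choosing the form of P(e) and P(f, a | e), training to estimating their parameters from a bilingual corpus, and search to computing the argmax over e. Hierarchical phrase alignments would then act as a constraint on, or an initialization of, the alignment variable a.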

Keywords:
Phrase; Computer science; Translation (biology); Machine translation; Example-based machine translation; Natural language processing; Artificial intelligence; Chunking (psychology); Statistical model; Machine translation software usability; Evaluation of machine translation; Hierarchical database model; Data mining

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.00
Refs: 9
Citation Normalized Percentile: 0.51

Topics

Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Biomedical Text Mining and Ontologies
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology