Learning bilingual translations from comparable corpora to cross-language information retrieval

Fatiha Sadat; Masatoshi Yoshikawa; Shunsuke Uemura

doi:10.3115/1118935.1118943

JOURNAL ARTICLE

Year: 2003 Vol: 11 Pages: 57-64

Abstract

Recent years have seen increased interest in the construction and use of large corpora. With this interest and awareness has come an expansion of their application to knowledge acquisition and bilingual terminology extraction. This paper presents an approach to bilingual lexicon extraction from non-aligned comparable corpora, combined with linguistics-based pruning, and evaluates it on Cross-Language Information Retrieval. We propose and explore a two-stage translation model for acquiring bilingual terminology from comparable corpora, then disambiguating and selecting the best translation alternatives on the basis of their morphological knowledge. Evaluations on a large-scale Japanese-English test collection, using different weighting schemes of the SMART retrieval system, confirmed the effectiveness of the proposed combination of the two-stage comparable corpora approach and linguistics-based pruning for Cross-Language Information Retrieval.
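The abstract does not spell out how translations are extracted, so the following is only a minimal sketch of the generic context-vector idea commonly used with comparable corpora, not the paper's two-stage model or its morphological pruning. The function names, the window size, the seed dictionary, and the romanized toy sentences are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=2):
    """Count how often each word co-occurs with neighbours within `window` tokens."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def project(vector, seed_dict):
    """Map a source-language context vector into target-language words
    using a (hypothetical) seed bilingual dictionary."""
    projected = Counter()
    for word, count in vector.items():
        for translation in seed_dict.get(word, []):
            projected[translation] += count
    return projected


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_translations(source_word, src_vecs, tgt_vecs, seed_dict, top_k=3):
    """Rank target-language words by similarity to the projected source context."""
    projected = project(src_vecs[source_word], seed_dict)
    scored = [(cosine(projected, vec), cand) for cand, vec in tgt_vecs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(cand, score) for score, cand in scored[:top_k]]


# Toy, non-aligned "comparable" data (invented for this sketch).
ja_sentences = [["ginko", "kane", "azukeru"], ["ginko", "kinri", "kane"]]
en_sentences = [
    ["bank", "money", "deposit"],
    ["bank", "interest", "money"],
    ["river", "water", "flow"],
]
seed = {"kane": ["money"], "azukeru": ["deposit"], "kinri": ["interest"]}

ranking = rank_translations(
    "ginko", context_vectors(ja_sentences), context_vectors(en_sentences), seed
)
```

On this toy data the top-ranked candidate for `ginko` is `bank`, because the two words share the same translated neighbours. A realistic system would add weighting, a far larger seed dictionary, and a pruning step such as the linguistics-based one the abstract mentions.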

Keywords:

Computer science; Natural language processing; Artificial intelligence; Terminology; Pruning; Cross-language information retrieval; Text corpus; Information extraction; Selection (genetic algorithm); Lexicon; Information retrieval; Machine translation; Linguistics

Metrics

FWCI (Field Weighted Citation Impact): 2.68

Citation Normalized Percentile: 0.91

Topics

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence

Biomedical Text Mining and Ontologies

Life Sciences → Biochemistry, Genetics and Molecular Biology → Molecular Biology

Related Documents

Extracting translations from comparable corpora for Cross-Language Information Retrieval using the language modeling framework

Japanese-English Cross Language Information Retrieval based on Comparable Corpora and Bilingual Dictionary

Bilingual terminology acquisition from comparable corpora and phrasal translation to cross-language information retrieval

Exploiting Comparable Corpora for Cross-Language Information Retrieval

Enhancing cross-language information retrieval by an automatic acquisition of bilingual terminology from comparable corpora