JOURNAL ARTICLE

Word-embedding based bilingual terminology alignment

Repar, Andraž; Martinc, Matej; Ulčar, Matej; Pollak, Senja

Year: 2021 Journal: Zenodo (CERN European Organization for Nuclear Research) Publisher: European Organization for Nuclear Research

Abstract

The ability to accurately align concepts between languages can provide significant benefits in many practical applications. In this paper, we extend a machine learning approach using dictionary and cognate-based features with novel cross-lingual embedding features using pretrained fastText embeddings. We use the tool VecMap to align the embeddings between Slovenian and English and then for every word calculate the top 3 closest word embeddings in the opposite language based on cosine distance. These alignments are then used as features for the machine learning algorithm. With one configuration of the input parameters, we managed to improve the overall F-score compared to previous work, while another configuration yielded improved precision (96%) at a cost of lower recall. Using embedding-based features as a replacement for dictionary-based features provides a significant benefit: while a large bilingual parallel corpus is required to generate the Giza++ word alignment lists, no such data is required for embedding-based features where the only required inputs are two unrelated monolingual corpora and a small bilingual dictionary from which the embedding alignments are calculated.
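The nearest-neighbour step described in the abstract can be illustrated with a minimal sketch. It assumes the two embedding spaces have already been mapped into a shared space (the abstract names VecMap for this) and uses made-up three-dimensional toy vectors in place of real fastText embeddings. The function names `cosine_similarity` and `top_k_neighbors` are hypothetical and are not taken from the paper. Ranking by highest cosine similarity is equivalent to ranking by lowest cosine distance.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_neighbors(
    query: Sequence[float],
    target_space: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k target-language words most similar to the query vector."""
    scored = [
        (word, cosine_similarity(query, vec))
        for word, vec in target_space.items()
    ]
    # Highest similarity first, i.e. smallest cosine distance first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy, invented vectors standing in for already-aligned embeddings.
sl_space = {"hiša": [0.9, 0.1, 0.0], "voda": [0.0, 0.2, 0.9]}
en_space = {
    "house": [1.0, 0.0, 0.0],
    "home": [0.8, 0.3, 0.1],
    "water": [0.0, 0.1, 1.0],
    "river": [0.1, 0.3, 0.8],
    "tree": [0.2, 0.9, 0.1],
}

neighbors = top_k_neighbors(sl_space["hiša"], en_space, k=3)
neighbor_words = [word for word, _ in neighbors]

# One possible binary feature for a candidate pair (an assumption, not the
# paper's exact feature set): is the English word among the top 3 neighbours?
house_in_top3 = "house" in neighbor_words
```

A feature of this kind could be computed in both directions (Slovenian to English and English to Slovenian) and passed to the classifier alongside the cognate-based features; how the paper actually encodes these alignments as features is not specified in the abstract.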

Keywords:
Word (group theory); Terminology; Embedding; Bilingual dictionary; Word embedding; Cosine similarity

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.33

Topics

Artificial Intelligence in Healthcare and Education
Health Sciences →  Medicine →  Health Informatics
Healthcare Technology and Patient Monitoring
Health Sciences →  Medicine →  Surgery
Technology Use by Older Adults
Social Sciences →  Social Sciences →  Demography

Related Documents

JOURNAL ARTICLE

Association-based bilingual word alignment

Robert C. Moore

Year: 2005 Pages: 1-1

BOOK-CHAPTER

Improving Word Alignment with Contextualized Embedding and Bilingual Dictionary

Minhan Xu, Yu Hong

Communications in computer and information science Year: 2021 Pages: 180-194

JOURNAL ARTICLE

Word alignment based on bilingual bracketing

Bing Zhao, Stephan Vogel

Year: 2003 Vol: 3 Pages: 15-18

JOURNAL ARTICLE

Word Alignment Based On Bilingual Bracketing

Zhao, Bing; Vogel, Stephan

Journal: KITopen Year: 2003