JOURNAL ARTICLE

Cross-lingual Sentiment Analysis of Code-Mixed Corpus based on Cross-lingual Word Embedding

Abstract

Bilingualism is a common linguistic phenomenon that poses a challenge for opinion mining. Early methods in Cross-lingual Sentiment Analysis (CLSA), based on machine translation, parallel corpora, and bilingual sentiment lexicons, suffer from translation errors, limited vocabulary coverage, and dependence on extensive parallel data. Hence, this study examined the effectiveness of Cross-lingual Word Embedding (CLWE) for the sentiment analysis of a code-mixed Filipino-English corpus. To address resource scarcity, a large-scale, manually annotated code-mixed dataset containing stakeholders' feedback on the services and infrastructure of Higher Education Institutions was developed. Several pre-trained transformer-based CLWE methods, such as mBERT, XLM-R, and XLM-T, were employed to represent the words of the two languages in the same vector space and obtain the cross-lingual embeddings. An Attention-based BiLSTM-CNN neural architecture, the baseline model from previous work, was fine-tuned on these cross-lingual embeddings to perform sentiment analysis of the code-mixed Filipino-English corpus. The experimental results show that XLM-T achieved the highest performance, with 91.30% accuracy, 90.36% precision, 90.92% recall, and a 90.61% F1-score. Employing cross-lingual word embeddings thus proved effective, as it significantly increased accuracy, by up to 10.02%, over the baseline model, which uses only word embeddings without cross-lingual alignment.
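For readers less familiar with the four figures reported above, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed for a multi-class sentiment task. It is not the authors' evaluation code. The abstract does not state the averaging scheme or the label set, so macro-averaging, the function name `macro_report`, the labels `pos`/`neg`/`neu`, and the toy predictions are all assumptions made for illustration.

```python
from collections import Counter


def macro_report(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro-averaging is an assumption here; the abstract does not say
    which averaging scheme was used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[guess] += 1   # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,  # mean of per-class F1, not F1 of the means
    }


# Hypothetical toy data, not taken from the study's dataset.
y_true = ["pos", "pos", "neg", "neg", "neu", "neu"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "pos"]
report = macro_report(y_true, y_pred)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

Under macro-averaging, the F1-score is the mean of the per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall.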

Keywords:
Computer science; Natural language processing; Word embedding; Artificial intelligence; Sentiment analysis; Word (group theory); Code (set theory); Embedding; Linguistics

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 35
Citation Normalized Percentile: 0.21

Topics

Sentiment Analysis and Opinion Mining: Physical Sciences → Computer Science → Artificial Intelligence
Topic Modeling: Physical Sciences → Computer Science → Artificial Intelligence
Advanced Text Analysis Techniques: Physical Sciences → Computer Science → Artificial Intelligence