Juuso Eronen, Michał Ptaszyński, Fumito Masui
The goal of our research is to develop an objective, scientifically backed method for selecting the transfer language in cross-lingual transfer. We propose a method that uses linguistic similarity metrics to measure the distance between languages and identify the optimal transfer language, rather than relying solely on intuition. Our findings indicate that linguistic similarity is a strong predictor of cross-lingual transfer performance across all of the Natural Language Processing tasks we examined: abusive language detection, sentiment analysis, named entity recognition, and dependency parsing. Moreover, we observe a statistically significant difference in performance when selecting a transfer source language other than English. This approach facilitates the selection of a more suitable transfer language, which helps leverage knowledge from high-resource languages and enhances the performance of low-resource language applications. Our study used datasets from eight languages belonging to three language families.