In this paper, we address the Event Detection task under a zero-shot cross-lingual setting in which a model is trained on a source language but evaluated on a distinct target language for which no labeled data is available. Most recent efforts in this field follow a direct transfer approach, in which the model is trained using language-invariant features and then directly applied to the target language. However, we argue that these methods forgo the benefits of the data transfer approach, in which a cross-lingual model is trained on target-language data and can thus learn task-specific information from syntactic features or word-label relations in the target language. We therefore propose a hybrid knowledge-transfer approach that leverages a teacher-student framework in which the teacher and student networks are trained following the direct and data transfer approaches, respectively. Our method is complemented by a hierarchical training-sample selection scheme designed to mitigate the noisy labels generated by the teacher model. Our model achieves state-of-the-art results on 9 morphologically diverse target languages across 3 distinct datasets, highlighting the importance of exploiting the benefits of hybrid transfer.