We present our submission to the WMT23 shared task in translation between English and Assamese, Khasi, Mizo and Manipuri. All our systems were pretrained on the task of multilingual masked language modelling and denoising auto-encoding. Our primary systems for translation into English were further pretrained for multilingual MT in all four language directions and fine-tuned on the limited parallel data available for each language pair separately. We used online back-translation for data augmentation. The same systems were submitted as contrastive for translation out of English as the multilingual MT pretraining step seemed to harm the translation performance. Our primary systems for translation out of English were trained without the multilingual MT pretraining step. Other contrastive systems used additional pseudo-parallel data mined from monolingual corpora for pretraining.
Dhairya SumanAtanu MandalSantanu PalSudip Kumar Naskar
P. M. AbhinavKetaki ShetyeParameswari Krishnamurthy
Chang JuJunpeng LiuKaiyu HuangDegen Huang