Defenses against adversarial attacks were originally proposed for computer vision, and adversarial training (AT) has recently emerged for natural language understanding. In AT, adversarial perturbations are added to the input word embeddings, and the resulting noisy data are included in training so that the model becomes noise invariant and thereby generalizes better. However, existing works have been limited to the supervised or semi-supervised setting. Meanwhile, contrastive learning (CL) has achieved strong performance in self-supervised pre-training of language models. This paper presents a novel method that reformulates CL as a self-supervised classification objective. Using this formulation, a self-supervised AT method is proposed for training an efficient sentence encoder. Experiments show that the proposed CL improves on previous methods for learning unsupervised sentence embeddings. With the help of AT, the method further surpasses previous supervised methods.
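The two ideas named in the abstract, CL viewed as a classification problem and an adversarial perturbation applied in embedding space, can be illustrated with a toy sketch. This is not the formulation proposed in the paper, which the abstract does not spell out. It assumes a standard InfoNCE-style loss with cosine similarity, a gradient-direction perturbation of fixed L2 norm, and small vectors that stand in directly for embeddings (no encoder). The function names, the temperature `tau`, and the radius `eps` are all hypothetical, and a finite-difference gradient replaces backpropagation to keep the example dependency-free.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchors, positives, tau=0.5):
    """InfoNCE-style loss read as N-way classification (an assumed form).

    For anchor i, the similarities to all positives act as logits and
    index i is the correct class, so the loss is a softmax cross-entropy.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / tau for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_z - logits[i]
    return total / len(anchors)


def numerical_grad(f, x, h=1e-5):
    """Central finite-difference gradient of scalar function f at x."""
    grad = []
    for k in range(len(x)):
        x_plus = x[:]
        x_minus = x[:]
        x_plus[k] += h
        x_minus[k] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def adversarial_perturb(anchors, positives, i, eps=0.1, tau=0.5):
    """Move anchor i by eps (L2 norm) along the loss gradient direction."""

    def loss_of(x):
        trial = anchors[:i] + [x] + anchors[i + 1:]
        return contrastive_loss(trial, positives, tau)

    grad = numerical_grad(loss_of, anchors[i])
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return anchors[i][:]  # flat loss: leave the vector unchanged
    return [a + eps * g / norm for a, g in zip(anchors[i], grad)]


if __name__ == "__main__":
    # Toy "embeddings": two anchors and their positive views.
    anchors = [[1.0, 0.0], [0.0, 1.0]]
    positives = [[0.9, 0.1], [0.1, 0.9]]
    clean = contrastive_loss(anchors, positives)
    perturbed = adversarial_perturb(anchors, positives, 0)
    noisy = contrastive_loss([perturbed, anchors[1]], positives)
    print("clean loss:", clean)
    print("loss with perturbed anchor 0:", noisy)
```

In a training loop of this kind, the perturbed vectors would be fed back as additional noisy inputs so that the encoder learns to keep the correct "class" despite the perturbation. How the paper combines the clean and adversarial terms is not stated in the abstract and is left open here.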