The cost of fine-tuning has increased significantly in recent years as language models have grown in parameter count. Prompt-tuning and adapters make it possible to train only a small number of parameters while obtaining results comparable to full fine-tuning. However, most current prompt-tuning methods rely on hand-crafted templates and verbalizers to achieve strong results in few-shot learning. In this work, we propose PPM, Prompt-free Prompt-tuning for Multi-task learning. First, we insert a task-specific adapter into the pre-trained language model to replace the hand-designed external template. Then, we train each adapter separately on a different task, updating only the parameters of the adapter layers. Next, we combine the different adapters and draw on their useful knowledge by tuning the parameters of a fusion module so as to minimize the loss while extracting knowledge from the different adapters. To speed up training, we replace Post-LN with Pre-LN, moving the LayerNorm layers from after the two addition (residual) layers to before the FFN layer and the multi-head attention layer. Experimental results on different NLP tasks show that our model achieves better synergy across diverse types of downstream tasks.
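The adapter insertion and the Pre-LN block layout described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the bottleneck size, module names, and the placement of the adapter after the FFN are assumptions for the sake of the example. The key points shown are that LayerNorm runs *before* the attention and FFN sub-layers (Pre-LN), and that only the adapter parameters are left trainable.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual.
    (Hypothetical sketch; the bottleneck width is an assumption.)"""
    def __init__(self, d_model: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.ReLU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

class PreLNBlockWithAdapter(nn.Module):
    """Transformer block with LayerNorm applied before the multi-head
    attention and FFN sub-layers (Pre-LN), plus a task-specific adapter."""
    def __init__(self, d_model: int = 256, n_heads: int = 4, d_ff: int = 1024):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        self.adapter = Adapter(d_model)

    def forward(self, x):
        h = self.ln1(x)  # Pre-LN: normalize before attention, not after the residual add
        x = x + self.attn(h, h, h, need_weights=False)[0]
        # Pre-LN before the FFN; the adapter post-processes the FFN output
        x = x + self.adapter(self.ffn(self.ln2(x)))
        return x

# Freeze the backbone; train only the adapter, as in parameter-efficient tuning
block = PreLNBlockWithAdapter()
for name, p in block.named_parameters():
    p.requires_grad = "adapter" in name
```

With this setup, a separate `Adapter` instance would be trained per task while the shared transformer weights stay frozen, which is what keeps the per-task training cost small.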
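The fusion step, combining knowledge from several separately trained adapters, can be illustrated with an attention-style mixer over stacked adapter outputs. This is a hedged sketch in the spirit of AdapterFusion-style combination, not the paper's exact fusion module; all names and shapes here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AdapterFusion(nn.Module):
    """Attention-style fusion over the outputs of several frozen,
    task-specific adapters (illustrative sketch, not the paper's module).
    Only these fusion parameters would be tuned during the fusion stage."""
    def __init__(self, d_model: int):
        super().__init__()
        self.query = nn.Linear(d_model, d_model)
        self.key = nn.Linear(d_model, d_model)
        self.value = nn.Linear(d_model, d_model)

    def forward(self, hidden, adapter_outs):
        # hidden:       (batch, seq, d_model)            backbone representation
        # adapter_outs: (batch, seq, n_adapters, d_model) stacked adapter outputs
        q = self.query(hidden).unsqueeze(2)              # (B, S, 1, D)
        k = self.key(adapter_outs)                       # (B, S, N, D)
        v = self.value(adapter_outs)                     # (B, S, N, D)
        scores = (q * k).sum(-1) / k.shape[-1] ** 0.5    # (B, S, N) scaled dot products
        w = scores.softmax(dim=-1).unsqueeze(-1)         # attention over the N adapters
        return (w * v).sum(dim=2)                        # (B, S, D) fused representation

fusion = AdapterFusion(d_model=256)
```

Training only `fusion`'s query/key/value projections, with the per-task adapters held fixed, corresponds to the stage described above where the fusion parameters are tuned to minimize the loss while extracting knowledge from the different adapters.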
Jingwei Zhang, Saarthak Kapse, Ke Ma, Prateek Prasanna, Joel Saltz, Maria Vakalopoulou, Dimitris Samaras
Jingping Liu, Tao Chen, Zujie Liang, Haiyun Jiang, Yanghua Xiao, Wei Feng, Yuxi Qian, Zhenghong Hao, Bing Han
Ting Bai, Le Huang, Yue Yu, Cheng Yang, Chao-Ju Hou, Zhe Zhao, Chuan Shi
Yifan Wang, Yixin Cao, Zeping Li, Bowen Dong, Guangnan Ye, Hongfeng Chai