Existing Seq2Seq models for news summarization are prone to deviating from the original meaning, losing contextual semantic information, and fabricating facts. To address these problems, this paper improves on the pointer-generator network and proposes a hybrid pointer-generator model built on the BART pre-trained language model. First, BART, a pre-trained language model based on a denoising autoencoder, extracts deep semantic features; these features are fused with the original news text and fed into a pointer-generator network with a coverage mechanism. A Bi-LSTM encodes the input, and an LSTM decoder with an attention mechanism decodes the semantic vector into the summary. During decoding, the pointer mechanism resolves out-of-vocabulary (OOV) words, the coverage mechanism alleviates the generation of repeated words, and beam search improves the accuracy of the generated summary. Experimental results on the NLPCC2017 dataset show that the proposed BART-PGN model achieves higher ROUGE scores, confirming its effectiveness for Chinese news summarization.
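The pointer and coverage mechanisms described above can be sketched numerically. The snippet below is a minimal, hypothetical illustration (not the paper's implementation): the pointer mechanism mixes the decoder's vocabulary distribution with the attention distribution over source tokens, so OOV source words (ids beyond the fixed vocabulary) can still be copied, and the coverage penalty discourages attending repeatedly to the same source positions. All function and variable names here are illustrative assumptions.

```python
import numpy as np

def pointer_generator_step(p_vocab, attention, src_ids, p_gen, extended_vocab_size):
    """One decoding step of a pointer-generator mixture (illustrative sketch).

    p_vocab: (V,) softmax distribution over the fixed vocabulary
    attention: (T,) attention weights over the T source tokens
    src_ids: (T,) ids of source tokens in the *extended* vocabulary,
             so OOV source words get ids >= V and remain copyable
    p_gen: scalar in [0, 1], probability of generating vs. copying
    """
    final = np.zeros(extended_vocab_size)
    final[: len(p_vocab)] = p_gen * p_vocab
    # Scatter-add the copy distribution onto the source-token ids
    # (np.add.at accumulates when the same id occurs more than once).
    np.add.at(final, src_ids, (1.0 - p_gen) * attention)
    return final

def coverage_penalty(attention, coverage):
    """Coverage loss term: sum of elementwise minima between the current
    attention and the accumulated coverage vector, penalizing re-attention."""
    return np.minimum(attention, coverage).sum()
```

For example, with a 2-word vocabulary and a source containing one OOV token at extended id 2, the OOV token still receives probability mass through the copy branch, and the final distribution sums to 1.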
Tao Dong, Shimin Shan, Yu Liu, Yue Qian, Anqi Ma