In Chinese Named Entity Recognition (NER), the long short-term memory network, with its recurrent structure, can handle long-distance dependencies by capturing temporal features, but it captures only a single type of feature and its information-acquisition ability is limited. The Convolutional Neural Network (CNN), which uses multiple convolutional layers to process text in parallel, can improve the model's computational speed and capture the spatial features of text; however, simply stacking convolutional layers easily leads to the vanishing-gradient problem. To capture multi-dimensional text features simultaneously and mitigate vanishing gradients, this paper proposes a Chinese NER model, RoBERTa-wwm-DGCNN-BiLSTM-BMHA-CRF. First, text is represented as character-level embedding vectors by the pre-trained language model RoBERTa-wwm, which is based on whole-word masking, to capture deep contextual semantic information. Second, a gating mechanism and residual connections are added to the Dilated CNN (DCNN) to reduce the risk of vanishing gradients, and the Bi-directional Long Short-Term Memory (BiLSTM) network and the Dilated Gated CNN (DGCNN) then capture the temporal and spatial characteristics of the text, respectively. Third, a Bi-linear Multi-Head Attention (BMHA) mechanism dynamically fuses the multi-dimensional text features. Finally, a Conditional Random Field (CRF) constrains the outputs to obtain the best label sequence. Experimental results show that the F1 scores of the proposed model on the Resume, Weibo, and MSRA data sets are 97.20%, 74.28%, and 95.74%, respectively, demonstrating the model's effectiveness for Chinese NER.
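The DGCNN component described above combines a dilated convolution with a sigmoid gate and a residual connection, so that gradients can bypass the convolution during backpropagation. The following is a minimal one-channel sketch of such a block in plain Python; all function names and the toy 1-D setup are illustrative assumptions, not the authors' implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dilated_conv1d(seq, kernel, dilation):
    """1-D dilated convolution with zero padding (output keeps input length)."""
    k = len(kernel)
    pad = (k - 1) * dilation // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j * dilation] for j in range(k))
            for i in range(len(seq))]

def dgcnn_block(seq, conv_kernel, gate_kernel, dilation):
    """Gated dilated convolution with a residual connection:
         y = x + conv(x) * sigmoid(gate(x))
    The sigmoid gate controls how much convolutional information flows
    through, while the residual path lets gradients skip the convolution,
    reducing the risk of vanishing gradients when blocks are stacked."""
    conv = dilated_conv1d(seq, conv_kernel, dilation)
    gate = dilated_conv1d(seq, gate_kernel, dilation)
    return [x + c * sigmoid(g) for x, c, g in zip(seq, conv, gate)]

# Toy usage: one embedding channel over a 6-token sentence, dilation 2.
h = dgcnn_block([0.1, 0.5, -0.2, 0.3, 0.0, 0.4],
                conv_kernel=[0.2, 0.5, 0.2],
                gate_kernel=[0.1, 0.1, 0.1],
                dilation=2)
print(len(h))  # sequence length is preserved
```

In practice each block operates on multi-channel character embeddings with learned kernels, and increasing the dilation across stacked blocks enlarges the receptive field without adding parameters.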
Zhenxiang Sun, Runyuan Sun, Zhifeng Liang, Zhuang Su, Yongxin Yu, Shuainan Wu
Chaoyi Wan, Ruiqin Wang, Qishun Ji, Yimin Huang
Chao Du, Xuhong Liu, Lin Miao, Xiulei Liu