JOURNAL ARTICLE

Chinese symptom component recognition via bidirectional LSTM-CRF

Abstract

Chinese symptom descriptions are rich and varied, and their components are complex and changeable. Recognizing the components of Chinese symptoms is an important step in transforming unstructured electronic medical records into structured ones, and it helps capture the full information a symptom conveys. It is also the foundation of symptom standardization and condition quantification. In this paper, we first propose a model of Chinese symptom composition that classifies symptom components into eleven types, such as atom symptoms, body parts, and headwords. We then treat component recognition as a sequence labeling problem and solve it with a bidirectional LSTM-CRF combined with part-of-speech features and data augmentation. Experiments show that our method achieves the best performance, with accuracies of 92.77% at the symptom level and 94.34% at the component level, which are 20.72% and 14.42% higher than the base model, respectively.
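
As a rough illustration of the sequence labeling framing, the sketch below tags each character of a symptom with a BIO label and then picks the best tag path with Viterbi decoding, which is the decoding step a CRF layer typically uses. Everything specific here is an assumption for illustration and is not taken from the paper: the example symptom 腹部疼痛 ("abdominal pain"), the tag names BODY and ATOM, the BIO scheme itself, and the hand-set scores. In a bidirectional LSTM-CRF, the emission scores would come from the network and the transition scores would be learned.

```python
from typing import Dict, List, Tuple


def to_bio(chars: str, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert component spans (start, end_exclusive, type) into per-character BIO tags."""
    tags = ["O"] * len(chars)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag path; missing scores default to 0.0."""
    # best[t][tag] is the best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0].get(tag, 0.0) for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t].get(cur, 0.0)
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical example: 腹部 (body part) + 疼痛 (atom symptom).
symptom = "腹部疼痛"
bio_tags = to_bio(symptom, [(0, 2, "BODY"), (2, 4, "ATOM")])

tag_set = ["B-BODY", "I-BODY", "B-ATOM", "I-ATOM"]
# Hand-set scores: position 1 slightly prefers I-ATOM on emissions alone.
emissions = [
    {"B-BODY": 2.0},
    {"I-BODY": 1.0, "I-ATOM": 1.2},
    {"B-ATOM": 2.0},
    {"I-ATOM": 2.0},
]
# Penalize an I-ATOM tag that follows a BODY tag, which is an invalid BIO sequence.
transitions = {("B-BODY", "I-ATOM"): -5.0, ("I-BODY", "I-ATOM"): -5.0}
decoded = viterbi(emissions, transitions, tag_set)
```

In this toy setup, a purely per-character choice would pick I-ATOM at the second position, while the transition penalties steer the decoded path back to a valid tag sequence. That is the usual motivation for placing a CRF layer on top of per-token scores.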

Keywords:
Component (thermodynamics); Computer science; Standardization; Task (project management); Artificial intelligence; GRASP; Pattern recognition (psychology); Natural language processing; Sequence labeling; Speech recognition; Engineering

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.20
Refs: 20
Citation Normalized Percentile: 0.57

Topics

Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Traditional Chinese Medicine Studies
Health Sciences →  Medicine →  Complementary and alternative medicine
Biomedical Text Mining and Ontologies
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology