Effective management of cancer symptoms is pivotal for optimal clinical outcomes. This research aims to harness Electronic Health Records (EHRs), particularly unstructured clinical notes, as a rich data source for cancer symptom information. Given the complexity of extracting information from EHRs, we investigate the performance of various Large Language Models (LLMs), such as Bidirectional Encoder Representations from Transformers (BERT) and its variants, for cancer symptom identification. Using a carefully curated dataset of 1112 clinical notes annotated by experts for 13 prevalent cancer symptoms, we present a comparative analysis of BERT-base, SpanBERT, BioBERT, ClinicalBERT, and PubMedBERT. Our findings show that ClinicalBERT outperforms the other models, particularly in precision, recall, and F1-score. This result underscores the potential of ClinicalBERT to improve cancer symptom management through EHRs and to support oncological research and treatment decision-making.
Isa Spiero, Merijn H Rijk, Matthew A Scheeres, Frans H. Rutten, Geert-Jan Geersing, Tamara N Platteel, Karel G.M. Moons, Lotty Hooft, Johanna AA Damen, Roderick P Venekamp, Artuur Leeuwenberg
T. Elizabeth Workman, Ali M. Ahmed, Helen Sheriff, Venkatesh K. Raman, Sijian Zhang, Yijun Shao, Charles Faselis, Gregg C. Fonarow, Qing Zeng‐Treitler
Anis Yousefi, Negin Mastouri, Kamran Sartipi
G. Rajesh, R. Madhumitha, Sabarinath Arunagiri, M.K. Vishal