JOURNAL ARTICLE

Automated annotation and classification of BI-RADS assessment from radiology reports

Abstract

The Breast Imaging Reporting and Data System (BI-RADS) was developed to reduce variation in the descriptions of findings. Manual analysis of breast radiology report data is challenging but is necessary for clinical and healthcare quality assurance activities. The objective of this study is to develop a natural language processing (NLP) system for automated BI-RADS categories extraction from breast radiology reports. We evaluated an existing rule-based NLP algorithm, and then we developed and evaluated our own method using a supervised machine learning approach. We divided the BI-RADS category extraction task into two specific tasks: (1) annotation of all BI-RADS category values within a report, (2) classification of the laterality of each BI-RADS category value. We used one algorithm for task 1 and evaluated three algorithms for task 2. Across all evaluations and model training, we used a total of 2159 radiology reports from 18 hospitals, from 2003 to 2015. Performance with the existing rule-based algorithm was not satisfactory. Conditional random fields showed a high performance for task 1 with an F-1 measure of 0.95. Rules from partial decision trees (PART) algorithm showed the best performance across classes for task 2 with a weighted F-1 measure of 0.91 for BIRADS 0-6, and 0.93 for BIRADS 3-5. Classification performance by class showed that performance improved for all classes from Naïve Bayes to Support Vector Machine (SVM), and also from SVM to PART. Our system is able to annotate and classify all BI-RADS mentions present in a single radiology report and can serve as the foundation for future studies that will leverage automated BI-RADS annotation, to provide feedback to radiologists as part of a learning health system loop.

Keywords:
BI-RADS Artificial intelligence Computer science Support vector machine Naive Bayes classifier Machine learning Breast imaging Task (project management) Annotation Decision tree Conditional random field Natural language processing Medicine Breast cancer Mammography Cancer

Metrics

85
Cited By
3.68
FWCI (Field Weighted Citation Impact)
73
Refs
0.93
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Global Cancer Incidence and Screening
Health Sciences →  Medicine →  Oncology
AI in cancer detection
Physical Sciences →  Computer Science →  Artificial Intelligence
Biomedical Text Mining and Ontologies
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
© 2026 ScienceGate Book Chapters — All rights reserved.