Multicenter Colonoscopy Quality Measurement Utilizing Natural Language Processing

Timothy D. Imler; Justin Morea; Charles J. Kahi; Huiping Xu; Cynthia Calley; Thomas F. Imperiale

doi:10.14309/00000434-201410002-02250

Year: 2014; Journal: The American Journal of Gastroenterology; Vol: 109; Pages: S653; Publisher: Lippincott Williams & Wilkins

Abstract

Introduction: An accurate system for tracking both colonoscopy quality and surveillance intervals could improve the effectiveness and cost-effectiveness of colorectal cancer (CRC) screening and surveillance. The purpose of this study was to create and test such a system using natural language processing (NLP).

Methods: From 42,569 colonoscopies with pathology records from 13 centers, we randomly sampled 750 paired reports. We trained (n=250) and tested (n=500) an NLP-based program on 19 measurements that encompass colonoscopy quality measures and surveillance interval determination. Blinded, paired, annotated expert manual review was used as the reference standard. The remaining 41,819 non-annotated documents were processed through the NLP system without manual review to assess performance consistency. The primary outcome was system accuracy across the 19 measures.

Results: The overall error rate was 3.5% for the NLP system vs. 1.9% for the paired annotators (p=0.041). When 8 vaguely documented reports were removed and annotator errors were corrected to correspond to established pathologic definitions, 25.4% were incorrect by NLP and 21.1% by the initial annotator (p=0.07). Rates of pathologic findings calculated from the NLP system were similar to those calculated from the gold standard annotation for the majority of the measurements (see Table for partial listing). For the testing set, accuracy was 99.6% for CRC detection, 95.0% for advanced adenoma, 94.6% for non-advanced adenoma, 99.8% for advanced sessile serrated polyp, 99.2% for non-advanced sessile serrated polyp, 96.8% for ≥10 mm hyperplastic polyp, and 96.0% for <10 mm hyperplastic polyp. Lesion location showed high accuracy (87.0-99.8%). The number of adenomas had an accuracy of 90.2%. The adenoma detection rate across all documents and centers based on NLP was 29.1% (range 19.3-38.0%).

Conclusion: NLP can accurately report adenoma detection rate and the components for determining guideline-adherent colonoscopy surveillance intervals across a wide variety of sites that use different methods for reporting colonoscopy findings.

Disclosure: Dr. Imler, patent holder: TRAQ-ME. Dr. Morea, patent holder: TRAQ-ME.

Table 1: Results Across Non-annotated Data Set Using NLP From Training and Testing Sets (Partial Listing)
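To make the reported quantities concrete, the following minimal Python sketch shows one way the per-measure accuracy (agreement between NLP output and the reference annotation) and the per-center adenoma detection rate could be computed. It is an illustration only, not the study's system: the function names, the record layout, and the toy data are hypothetical, and the detection rate is computed over all reports as a simplification.

```python
from collections import defaultdict


def accuracy(predicted, reference):
    """Fraction of reports where the NLP value equals the reference annotation."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("no reports to score")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def adenoma_detection_rate(reports):
    """Per-center share of colonoscopies with at least one adenoma.

    `reports` is an iterable of (center, adenoma_count) pairs; this layout
    is an assumption made for the example.
    """
    totals = defaultdict(int)
    with_adenoma = defaultdict(int)
    for center, adenoma_count in reports:
        totals[center] += 1
        if adenoma_count >= 1:
            with_adenoma[center] += 1
    return {center: with_adenoma[center] / totals[center] for center in totals}


# Hypothetical toy data for a single yes/no measure (e.g. CRC detected).
nlp_crc = [False, True, False, False]
ref_crc = [False, True, False, True]
crc_accuracy = accuracy(nlp_crc, ref_crc)  # 3 of 4 reports agree

# Hypothetical toy reports from two centers.
toy_reports = [("A", 0), ("A", 2), ("B", 1), ("B", 0), ("B", 0), ("B", 0)]
adr_by_center = adenoma_detection_rate(toy_reports)

# Arithmetic from the Methods: documents left after drawing the annotated sample.
remaining = 42569 - 750
```

With the toy inputs, the single measure scores 0.75, centers A and B have detection rates of 0.5 and 0.25, and the remaining document count matches the 41,819 stated in the Methods.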

Keywords:

Medicine; Colonoscopy; Gold standard (test); Artificial intelligence; Adenoma; Natural language processing; Consistency (knowledge bases); Colorectal cancer; Medical physics; Radiology; Computer science; Cancer; Pathology; Internal medicine

Topics

Colorectal Cancer Screening and Detection

Health Sciences → Medicine → Oncology
