Classification of Current Procedural Terminology Codes from Electronic Health Record Data Using Machine Learning

Michael Burns; Michael R. Mathis; John Vandervest; Xinyu Tan; Bo Lu; Douglas A. Colquhoun; Nirav Shah; Sachin Kheterpal; Leif Saager

doi:10.1097/aln.0000000000003150

JOURNAL ARTICLE

Year: 2020; Journal: Anesthesiology; Vol: 132 (4); Pages: 738-749; Publisher: Lippincott Williams & Wilkins

Abstract

Background: Accurate anesthesiology procedure code data are essential to quality improvement, research, and reimbursement tasks within anesthesiology practices. Advanced data science techniques, including machine learning and natural language processing, offer opportunities to develop classification tools for Current Procedural Terminology codes across anesthesia procedures.

Methods: Models were created using a Train/Test dataset including 1,164,343 procedures from 16 academic and private hospitals. Five supervised machine learning models were created to classify anesthesiology Current Procedural Terminology codes, with accuracy defined as first choice classification matching the institutional-assigned code existing in the perioperative database. The two best performing models were further refined and tested on a Holdout dataset from a single institution distinct from Train/Test. A tunable confidence parameter was created to identify cases for which models were highly accurate, with the goal of at least 95% accuracy, above the reported 2018 Centers for Medicare and Medicaid Services (Baltimore, Maryland) fee-for-service accuracy. Actual submitted claim data from billing specialists were used as a reference standard.

Results: Support vector machine and neural network label-embedding attentive models were the best performing models, respectively, demonstrating overall accuracies of 87.9% and 84.2% (single best code), and 96.8% and 94.0% (within top three). Classification accuracy was 96.4% in 47.0% of cases using support vector machine and 94.4% in 62.2% of cases using label-embedding attentive model within the Train/Test dataset. In the Holdout dataset, respective classification accuracies were 93.1% in 58.0% of cases and 95.0% among 62.0%. The most important feature in model training was procedure text.

Conclusions: Through application of machine learning and natural language processing techniques, highly accurate real-time models were created for anesthesiology Current Procedural Terminology code classification. The increased processing speed and a priori targeted accuracy of this classification approach may provide performance optimization and cost reduction for quality improvement, research, and reimbursement tasks reliant on anesthesiology procedure codes.
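The abstract reports three kinds of figures: single-best-code accuracy, top-three accuracy, and accuracy on the subset of cases retained by a tunable confidence parameter. It does not say how that parameter was computed, so the following is only a minimal sketch under an assumption: each prediction carries a confidence score, and cases below a threshold are set aside. The function names, the toy codes ("A" to "D"), and the scores are all hypothetical and are not taken from the study.

```python
from typing import List, Optional, Sequence, Tuple

# One prediction: (candidate codes ranked best-first, confidence in the top candidate).
# This representation is an assumption made for illustration.
Prediction = Tuple[List[str], float]


def top_k_accuracy(preds: Sequence[Prediction], reference: Sequence[str], k: int) -> float:
    """Fraction of cases whose reference code appears among the top-k candidates."""
    hits = sum(ref in ranked[:k] for (ranked, _), ref in zip(preds, reference))
    return hits / len(reference)


def accuracy_at_threshold(
    preds: Sequence[Prediction], reference: Sequence[str], threshold: float
) -> Tuple[float, float]:
    """Return (accuracy, coverage) over the cases with confidence >= threshold."""
    kept = [
        (ranked[0], ref)
        for (ranked, conf), ref in zip(preds, reference)
        if conf >= threshold
    ]
    if not kept:
        return 0.0, 0.0
    accuracy = sum(pred == ref for pred, ref in kept) / len(kept)
    return accuracy, len(kept) / len(reference)


def tune_threshold(
    preds: Sequence[Prediction], reference: Sequence[str], target: float = 0.95
) -> Optional[Tuple[float, float, float]]:
    """Lowest threshold (hence largest coverage) whose retained cases meet the target.

    Returns (threshold, accuracy, coverage), or None if no threshold reaches the target.
    """
    for threshold in sorted({conf for _, conf in preds}):
        accuracy, coverage = accuracy_at_threshold(preds, reference, threshold)
        if accuracy >= target:
            return threshold, accuracy, coverage
    return None


# Toy data: five cases with made-up codes and confidence scores.
toy_preds: List[Prediction] = [
    (["A", "B", "C"], 0.99),
    (["B", "A", "C"], 0.95),
    (["C", "A", "B"], 0.90),
    (["A", "C", "B"], 0.60),
    (["B", "C", "D"], 0.55),
]
toy_reference = ["A", "B", "C", "C", "A"]

if __name__ == "__main__":
    print("top-1 accuracy:", top_k_accuracy(toy_preds, toy_reference, 1))
    print("top-3 accuracy:", top_k_accuracy(toy_preds, toy_reference, 3))
    print("tuned (threshold, accuracy, coverage):", tune_threshold(toy_preds, toy_reference))
```

Because accuracy on the retained subset need not rise steadily as the threshold increases, the sketch checks every observed confidence value rather than searching by bisection. In practice the threshold would be chosen on one dataset and then evaluated on separate data, as the abstract does with its Holdout dataset.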

Keywords:

Current Procedural Terminology; Machine learning; Artificial intelligence; Terminology; Medicine; Anesthesiology; Reimbursement; Computer science; Test (biology); Test data; Data mining; Natural language processing; Health care; Surgery

Metrics

FWCI (Field Weighted Citation Impact): 2.04

Citation Normalized Percentile: 0.87

Topics

Cardiac, Anesthesia and Surgical Outcomes

Health Sciences → Medicine → Cardiology and Cardiovascular Medicine

Machine Learning in Healthcare

Physical Sciences → Computer Science → Artificial Intelligence

Electronic Health Records Systems

Health Sciences → Health Professions → Health Information Management

Related Documents

Current procedural terminology codes

Machine learning functional impairment classification with electronic health record data

Using Machine Learning to Identify Health Outcomes from Electronic Health Record Data

Inferring high-fat dietary patterns from electronic health record data using machine learning

Comparison of machine-learning algorithms for the prediction of Current Procedural Terminology (CPT) codes from pathology reports