A comprehensive study of deep learning compiler bugs

Qingchao Shen; Haoyang Ma; Junjie Chen; Yongqiang Tian; Shing-Chi Cheung; Xiang Chen

doi:10.1145/3468264.3468591

JOURNAL ARTICLE

Year: 2021 Pages: 968-980

Abstract

There are increasing uses of deep learning (DL) compilers to generate optimized code, boosting the runtime performance of DL models on specific hardware. Like their traditional counterparts, DL compilers can generate incorrect code, resulting in unexpected model behaviors that may cause catastrophic consequences in mission-critical systems. On the other hand, the DL models processed by DL compilers differ fundamentally from imperative programs in that the program logic in DL models is implicit. As such, various characteristics of the bugs arising from traditional compilers need to be revisited in the context of DL compilers. In this paper, we present the first systematic study of DL compiler bugs by analyzing 603 bugs arising in three popular DL compilers (i.e., TVM from Apache, Glow from Facebook, and nGraph from Intel). We analyzed these bugs according to their root causes, symptoms, and the stages where they occur during compilation. We obtain 12 findings, and provide a series of valuable guidelines for future work on DL compiler bug detection and debugging. For example, a large portion (nearly 20%) of DL compiler bugs are related to types, especially tensor types. The analysis of these bugs helps design new mutation operators (e.g., adding type cast for a tensor to promote implicit type conversion in subsequent tensor computations) to facilitate type-related bug detection. Further, we developed TVMfuzz as a proof-of-concept application of our findings to test the TVM DL compiler. It generates new tests based on TVM's original test suite. They expose 8 TVM bugs that are missed by the original test suite. The result demonstrates the usefulness of our findings. © 2021 ACM.
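The type-cast mutation operator mentioned in the abstract can be illustrated with a minimal sketch. This is not TVMfuzz and does not use the TVM API: the toy expression classes (`Var`, `Cast`, `Add`), the promotion table `_RANK`, and the helpers `infer_dtype` and `insert_cast` are all hypothetical names introduced here for illustration. The assumption is that wrapping one tensor operand in an explicit cast makes a later binary operation see mismatched operand types, so the compiler's implicit conversion logic gets exercised.

```python
from dataclasses import dataclass
from typing import Union

# Toy promotion order (assumption); a real DL compiler defines its own rules.
_RANK = {"int8": 0, "int32": 1, "float16": 2, "float32": 3, "float64": 4}


@dataclass(frozen=True)
class Var:
    name: str
    dtype: str


@dataclass(frozen=True)
class Cast:
    value: "Expr"
    dtype: str


@dataclass(frozen=True)
class Add:
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Var, Cast, Add]


def infer_dtype(e: Expr) -> str:
    """Result type of an expression; Add promotes to the wider operand type."""
    if isinstance(e, (Var, Cast)):
        return e.dtype
    left, right = infer_dtype(e.lhs), infer_dtype(e.rhs)
    return left if _RANK[left] >= _RANK[right] else right


def insert_cast(e: Expr, var_name: str, new_dtype: str) -> Expr:
    """Mutation operator: wrap every use of `var_name` in an explicit cast."""
    if isinstance(e, Var):
        return Cast(e, new_dtype) if e.name == var_name else e
    if isinstance(e, Cast):
        return Cast(insert_cast(e.value, var_name, new_dtype), e.dtype)
    return Add(insert_cast(e.lhs, var_name, new_dtype),
               insert_cast(e.rhs, var_name, new_dtype))


# Seed test: both operands already agree on float32, so no conversion happens.
seed = Add(Var("x", "float32"), Var("y", "float32"))

# Mutant: x is cast down to float16, so Add now has to convert its left
# operand back up to float32 implicitly.
mutant = insert_cast(seed, "x", "float16")
```

In this sketch the overall result type stays float32, but the mutant's left operand is float16, which is the kind of mixed-type computation the abstract suggests is useful for exposing type-related bugs.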

Keywords:

Compiler; Computer science; Programming language; Boosting (machine learning); Deep learning; Context (archaeology); Code (set theory); Code generation; Parallel computing; Artificial intelligence; Operating system

Metrics

Cited By: 101

FWCI (Field Weighted Citation Impact): 22.82

Citation Normalized Percentile: 1.00

Topics

Software Testing and Debugging Techniques (Physical Sciences → Computer Science → Software)

Software Reliability and Analysis Research (Physical Sciences → Computer Science → Software)

Software Engineering Research (Physical Sciences → Computer Science → Information Systems)
