Coverage-guided fuzzing for deep reinforcement learning systems

Xiaohui Wan; Tiancheng Li; Weibin Lin; Yi Cai; Zheng Zheng

doi:10.1016/j.jss.2024.111963

JOURNAL ARTICLE

Year: 2024
Journal: Journal of Systems and Software
Vol: 210
Pages: 111963-111963
Publisher: Elsevier BV

Abstract

While the past decade has witnessed a growing demand for employing deep reinforcement learning (DRL) in various domains to solve real-world problems, the reliability of DRL systems has become more of a concern. In particular, DRL agents are often trained on data from a potentially biased distribution over environmental settings, causing the trained agents to fail in certain cases despite high average-case performance. Hence, it is necessary and urgent to adequately test DRL agents to ensure the reliability of practical DRL systems. However, due to the fundamental difference in the programming paradigm and the development process, traditional software testing methodology cannot be applied directly to DRL systems. Given that, we introduce a novel testing framework for DRL systems, aiming to generate diverse test cases that can drive a DRL system to fail. Specifically, we design, implement and evaluate DRLFuzz, which is a coverage-guided fuzzing (CGF) framework for systematically testing DRL systems. Experimental results demonstrate that DRLFuzz can efficiently discover diverse failures in different DRL systems for various benchmark tasks. Compared with a random search baseline, DRLFuzz can generate 60% more failed cases in general. Additionally, the diversity of failed cases generated by DRLFuzz is increased by 4.6%∼14.1% in terms of mean pairwise distance (MPD). Furthermore, our experiments also indicate that the failed cases generated by DRLFuzz can be utilized to fine-tune the DRL agent to eliminate the failures resulting from inadequate exploration during training and thus improve the reliability of DRL systems.
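The two technical ideas named in the abstract, coverage-guided fuzzing and mean pairwise distance (MPD), can be illustrated with a minimal sketch. This is not the DRLFuzz implementation, whose details the abstract does not give. The sketch makes several assumptions: a test case is a two-dimensional initial state in [0, 1], coverage is measured as the set of grid cells visited by initial states, and `toy_episode_fails` is a hypothetical stand-in for running a trained agent and checking whether the episode fails. All function names are illustrative.

```python
import itertools
import math
import random


def mean_pairwise_distance(points):
    """Average Euclidean distance over all unordered pairs of points."""
    pairs = list(itertools.combinations(points, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def toy_episode_fails(state):
    """Hypothetical stand-in for running a DRL agent from an initial state.

    Assumption: the agent fails whenever the start state lies in one corner.
    """
    x, y = state
    return x > 0.8 and y > 0.8


def cell_of(state, bins=10):
    """Coverage criterion assumed here: the grid cell of the initial state."""
    return tuple(min(int(v * bins), bins - 1) for v in state)


def mutate(state, rng, step=0.1):
    """Perturb each coordinate slightly and clip it back into [0, 1]."""
    return tuple(min(1.0, max(0.0, v + rng.uniform(-step, step))) for v in state)


def coverage_guided_fuzz(seeds, iterations, rng):
    """Keep mutants that reach a new cell; collect the ones that fail."""
    corpus = list(seeds)
    covered = {cell_of(s) for s in corpus}
    failures = []
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        if toy_episode_fails(candidate):
            failures.append(candidate)
        cell = cell_of(candidate)
        if cell not in covered:
            # New coverage: the mutant becomes a seed for later mutations.
            covered.add(cell)
            corpus.append(candidate)
    return failures, covered


if __name__ == "__main__":
    rng = random.Random(0)
    failures, covered = coverage_guided_fuzz([(0.5, 0.5)], 2000, rng)
    print(len(failures), len(covered), mean_pairwise_distance(failures))
```

The coverage feedback is what separates this loop from a random search baseline: a mutant is added to the corpus only when it reaches an unvisited cell, so later mutations start from progressively more varied states. A higher MPD over the collected failures indicates that they are more spread out in the input space.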

Keywords: Fuzz testing; Reinforcement learning; Computer science; Reliability (semiconductor); Pairwise comparison; Benchmark (surveying); Artificial intelligence; Machine learning; Reliability engineering; Software Engineering; Cartography

Metrics

FWCI (Field Weighted Citation Impact): 5.75

Citation Normalized Percentile: 0.93

Topics

Reinforcement Learning in Robotics

Physical Sciences → Computer Science → Artificial Intelligence

Software Testing and Debugging Techniques

Physical Sciences → Computer Science → Software

Viral Infectious Diseases and Gene Expression in Insects

Life Sciences → Biochemistry, Genetics and Molecular Biology → Molecular Biology

Related Documents

Coverage-Guided Fuzzing for Deep Reinforcement Learning Systems

DeepCov: Coverage Guided Deep Learning Framework Fuzzing

Deep2Fuzz: Coverage-Guided Reinforcement Fuzzing towards Deep Neural Networks

Python Coverage Guided Fuzzing for Deep Learning Framework

RLGFuzz: Reinforcement Learning Guided Fuzzing with State-Coverage Mapping Environment