Coverage-Guided Learning-Assisted Grammar-Based Fuzzing

Yuma Jitsunari; Yoshitaka Arahori

doi:10.1109/icstw.2019.00065

JOURNAL ARTICLE

Year: 2019 Pages: 275-280

Abstract

Grammar-based fuzzing is known to be an effective technique for checking for security vulnerabilities in programs, such as parsers, that take complex structured inputs. Unfortunately, most existing grammar-based fuzzers require considerable manual effort to write complex input grammars, which hinders their practical use. To address this problem, recently proposed approaches use machine learning to automatically acquire a generative model for structured inputs conforming to a complex grammar. Even these approaches, however, have major limitations: they fail to learn a generative model for instruction sequences, and they cannot achieve good coverage of instruction-parsing code. To overcome these limitations, this paper proposes a collection of techniques for enhancing learning-assisted grammar-based fuzzing. Our approach enables learning a generative model for instruction sequences by training a hybrid character/token-level recursive neural network. In addition, we exploit coverage metrics gathered during previous fuzzing runs to efficiently refine (or fine-tune) the learnt model so that it generates new inputs that induce high coverage. Our experiments with a real PDF parser show that our approach succeeded in generating new sequences of instructions (in PDF page streams) that induce better code coverage (of the PDF parser) than state-of-the-art learning-assisted grammar-based fuzzers.
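
The coverage-guided refinement mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how coverage feeds back into training, so the sketch assumes one plausible reading, namely that generated inputs reaching previously unseen code are kept as fine-tuning data. The names select_for_finetuning and toy_target are hypothetical, and toy_target is a stand-in for an instrumented PDF parser that simply maps a few PDF content-stream operators to made-up block IDs.

```python
from typing import Callable, Iterable, List, Set, Tuple


def select_for_finetuning(
    candidates: Iterable[str],
    run_target: Callable[[str], Set[int]],
    seen: Set[int],
) -> Tuple[List[str], Set[int]]:
    """Keep generated inputs that reach coverage not seen in earlier runs.

    Assumption: the kept inputs would later be used to fine-tune the
    generative model; that training step is outside this sketch.
    """
    kept: List[str] = []
    covered = set(seen)  # copy, so the caller's set is not mutated
    for sample in candidates:
        hit = run_target(sample)
        if hit - covered:  # at least one newly covered block
            kept.append(sample)
            covered |= hit
    return kept, covered


def toy_target(stream: str) -> Set[int]:
    """Hypothetical stand-in for an instrumented parser.

    Returns one made-up block ID per recognized operator in the stream.
    """
    ops = {"BT": 1, "ET": 2, "Tf": 3, "Tj": 4, "re": 5}
    return {ops[tok] for tok in stream.split() if tok in ops}


if __name__ == "__main__":
    generated = ["BT ET", "BT /F1 12 Tf ET", "BT ET", "0 0 10 10 re"]
    kept, covered = select_for_finetuning(generated, toy_target, set())
    print(kept)
    print(sorted(covered))
```

In a real setting, run_target would execute the parser under a coverage tool and return the set of covered basic blocks or edges; the greedy "keep if anything new" rule is only one possible selection policy.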

Keywords:

Fuzz testing; Computer science; Parsing; Generative grammar; Grammar; Artificial intelligence; Rule-based machine translation; Programming language; Security token; Exploit; Natural language processing; Machine learning; Software

Metrics

FWCI (Field Weighted Citation Impact): 0.33

Citation Normalized Percentile: 0.54

Topics

Advanced Malware Detection Techniques

Physical Sciences → Computer Science → Signal Processing

Software Testing and Debugging Techniques

Physical Sciences → Computer Science → Software

Adversarial Robustness in Machine Learning

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Quantifying the Limitations of Learning-Assisted Grammar-Based Fuzzing

DeepCov: Coverage Guided Deep Learning Framework Fuzzing

Coverage-Guided Fuzzing for Plan-Based Robotics

Python Coverage Guided Fuzzing for Deep Learning Framework

Coverage-guided fuzzing for deep reinforcement learning systems