On Detecting Adversarial Perturbations

Jan Hendrik Metzen; Tim Genewein; Volker Fischer; Bastian Bischoff

doi:10.48550/arxiv.1702.04267

JOURNAL ARTICLE

Year: 2017; Journal: arXiv (Cornell University); Publisher: Cornell University

Abstract

Machine learning, and deep learning in particular, has advanced tremendously on perceptual tasks in recent years. However, it remains vulnerable to adversarial perturbations of the input that have been crafted specifically to fool the system while being quasi-imperceptible to a human. In this work, we propose to augment deep neural networks with a small "detector" subnetwork that is trained on the binary classification task of distinguishing genuine data from data containing adversarial perturbations. Our method is orthogonal to prior work on addressing adversarial perturbations, which has mostly focused on making the classification network itself more robust. We show empirically that adversarial perturbations can be detected surprisingly well even though they are quasi-imperceptible to humans. Moreover, while the detectors have been trained to detect only a specific adversary, they generalize to similar and weaker adversaries. In addition, we propose an adversarial attack that fools both the classifier and the detector, and a novel training procedure for the detector that counteracts this attack.
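
The detector idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the classifier is a toy logistic regression on synthetic Gaussian data rather than a deep network, the adversary is a one-step gradient-sign perturbation with a hypothetical budget `eps`, and the detector is a small stand-alone network on the raw input rather than a subnetwork attached to the classifier. Only the overall recipe follows the abstract: generate perturbed copies of genuine data, then train a binary classifier to tell the two apart.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy stand-in for the classifier: logistic regression on two Gaussian blobs.
n, d = 1000, 20
labels = rng.integers(0, 2, size=n)
x_clean = rng.normal(size=(n, d)) + (2 * labels[:, None] - 1) * 1.0

w, b = np.zeros(d), 0.0
for _ in range(200):
    p = sigmoid(x_clean @ w + b)
    w -= 0.1 * x_clean.T @ (p - labels) / n
    b -= 0.1 * np.mean(p - labels)

def classifier_accuracy(x):
    return np.mean((x @ w + b > 0) == (labels == 1))

# Assumed adversary: one gradient-sign step of size eps (L-infinity bounded).
eps = 0.5
p = sigmoid(x_clean @ w + b)
grad_x = (p - labels)[:, None] * w[None, :]  # d(cross-entropy)/d(input)
x_adv = x_clean + eps * np.sign(grad_x)

# Detector training set: genuine inputs get label 0, perturbed inputs label 1.
x_det = np.vstack([x_clean, x_adv])
y_det = np.concatenate([np.zeros(n), np.ones(n)])

# Small one-hidden-layer detector trained with plain gradient descent.
hidden_units, lr = 16, 0.1
W1 = rng.normal(scale=0.1, size=(d, hidden_units))
b1 = np.zeros(hidden_units)
W2 = rng.normal(scale=0.1, size=hidden_units)
b2 = 0.0
for _ in range(500):
    hidden = np.tanh(x_det @ W1 + b1)
    q = sigmoid(hidden @ W2 + b2)
    err = (q - y_det) / len(y_det)  # gradient of mean cross-entropy w.r.t. logit
    grad_hidden = np.outer(err, W2) * (1.0 - hidden ** 2)
    W2 -= lr * hidden.T @ err
    b2 -= lr * err.sum()
    W1 -= lr * x_det.T @ grad_hidden
    b1 -= lr * grad_hidden.sum(axis=0)

def detect(x):
    """Return the detector's estimated probability that each row is perturbed."""
    return sigmoid(np.tanh(x @ W1 + b1) @ W2 + b2)

# Measured on the detector's own training data, so this is an optimistic figure.
det_acc = np.mean((detect(x_det) > 0.5) == (y_det == 1))
print("classifier accuracy, clean:", classifier_accuracy(x_clean))
print("classifier accuracy, perturbed:", classifier_accuracy(x_adv))
print("detector accuracy:", det_acc)
```

The numbers this toy prints say nothing about the paper's results; a faithful reproduction would need the networks, datasets, and adversaries described in the full text.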

Keywords:

Adversarial system; Subnetwork; Computer science; Adversary; Artificial intelligence; Classifier (UML); Detector; Adversarial machine learning; Binary classification; Deep neural networks; Machine learning; Task (project management); Deep learning; Binary number; Computer security; Mathematics; Support vector machine; Engineering

Metrics

Cited By: 219

FWCI (Field Weighted Citation Impact): 0.00

Topics

Adversarial Robustness in Machine Learning

Physical Sciences → Computer Science → Artificial Intelligence

Anomaly Detection Techniques and Applications

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Detecting Adversarial Perturbations with Saliency

Detecting Adversarial Perturbations with Salieny

Cassandra: Detecting Trojaned Networks From Adversarial Perturbations

Detecting Adversarial Perturbations in Multi-Task Perception

Detecting Adversarial Perturbations with Pre-trained Models