Adversarial perturbations and deformations for convolutional neural networks
OPEN ACCESS
Date
2017-12-15
Publication Type
Master Thesis
ETH Bibliography
yes
Abstract
We investigate the robustness of convolutional neural networks for image classification with respect to adversarial examples. We compare a widely used network for handwritten digit classification to a more robust classifier based on a feature extractor with predefined wavelet filters. In addition to generating adversarial examples by adding small perturbations to the input images, we introduce the concept of adversarial deformations of images. An algorithm for producing this new type of adversarial example is established in the setting where the input signals belong to an infinite-dimensional Hilbert space. The method makes use of the Fréchet derivative of a functional on a space of (smooth) vector fields. To emulate a black-box scenario, the derivatives of the classifier, which are needed to generate adversarial examples, are computed numerically. We demonstrate that a simple dimensionality reduction can significantly reduce the number of classifier queries this requires.
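The query-count argument in the abstract can be illustrated with a small sketch. It is not the thesis's method: the abstract does not say which dimensionality reduction is used, so the block-wise grouping of pixels below is an assumption chosen for simplicity, and the names `fd_gradient`, `block`, `eps` and `score` are hypothetical. The sketch estimates the gradient of a scalar black-box score by forward differences and counts how often the score function is queried.

```python
import numpy as np

def fd_gradient(f, x, block=1, eps=1e-3):
    """Forward-difference gradient estimate of a scalar black-box f at image x.

    Assumption (not from the thesis): the dimensionality reduction groups
    pixels into block x block patches that are perturbed together.
    Returns (gradient estimate with the shape of x, number of queries to f).
    """
    h, w = x.shape
    f0 = f(x)          # one query for the unperturbed input
    queries = 1
    grad = np.zeros_like(x, dtype=float)
    for i in range(0, h, block):
        for j in range(0, w, block):
            d = np.zeros_like(x, dtype=float)
            d[i:i + block, j:j + block] = 1.0   # indicator of the patch
            n = d.sum()                         # pixels in the patch
            # Slope along d approximates the sum of partial derivatives
            # over the patch; dividing by n gives their average.
            slope = (f(x + eps * d) - f0) / eps
            queries += 1
            grad[i:i + block, j:j + block] = slope / n
    return grad, queries

# Toy black-box score on a 28 x 28 image (hypothetical stand-in for a classifier).
x = np.full((28, 28), 0.5)
score = lambda img: 2.0 * img.sum()
g_full, q_full = fd_gradient(score, x, block=1)
g_red, q_red = fd_gradient(score, x, block=4)
```

For this 28 x 28 input, the pixel-wise estimate uses 785 queries and the 4 x 4 block version uses 50. The reduced estimate is exact here only because the toy score has a constant gradient; for a real classifier it is an approximation whose quality depends on how well the chosen subspace captures the gradient.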
Publication status
published
Contributors
Examiner: Alaifari, Rima
Publisher
ETH Zurich
Organisational unit
09603 - Alaifari, Rima (ehemalig) / Alaifari, Rima (former)
01411 - MSc Mathematik / MSc Mathematics