Adversarial perturbations and deformations for convolutional neural networks


Date

2017-12-15

Publication Type

Master Thesis

ETH Bibliography

yes

Abstract

We investigate the robustness of convolutional neural networks for image classification with respect to adversarial examples. We compare a widely used network for handwritten digit classification to a more robust classifier based on a feature extractor with predefined wavelet filters. In addition to generating adversarial examples by adding small perturbations to the input images, we introduce the concept of adversarial deformations of images. An algorithm for producing this new type of adversarial example is established in the setting where the input signals belong to an infinite-dimensional Hilbert space. The method makes use of the Fréchet derivative of a functional on a space of (smooth) vector fields. To emulate a black box scenario, the derivatives of the classifier (needed to generate adversarial examples) are computed numerically. We demonstrate that a simple dimensionality reduction can significantly reduce the number of queries to the classifier necessary for this purpose.
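
The query-saving idea mentioned at the end of the abstract can be illustrated with a minimal sketch. The abstract does not say which dimensionality reduction the thesis uses, so the block-constant perturbation basis below is an assumption chosen for illustration, and the names `estimate_gradient`, `block`, `eps`, and the toy linear `score` function are all hypothetical. The sketch estimates the derivative of a black-box scalar score by central differences, probing one direction per image block rather than one per pixel.

```python
import numpy as np

def estimate_gradient(f, x, block=4, eps=1e-3):
    """Central-difference gradient estimate of a black-box scalar function f
    at image x, using only perturbations that are constant on block x block
    patches (an assumed, illustrative form of dimensionality reduction)."""
    h, w = x.shape
    assert h % block == 0 and w % block == 0, "image must tile evenly into blocks"
    grad = np.zeros((h, w), dtype=float)
    queries = 0
    for i in range(0, h, block):
        for j in range(0, w, block):
            # Indicator of one patch: a single direction in the reduced space.
            d = np.zeros((h, w), dtype=float)
            d[i:i + block, j:j + block] = 1.0
            # Two queries to the black box per direction.
            slope = (f(x + eps * d) - f(x - eps * d)) / (2 * eps)
            queries += 2
            # Spread the directional derivative evenly over the patch.
            grad[i:i + block, j:j + block] = slope / block**2
    return grad, queries

# Toy stand-in for a classifier score: linear, with block-constant weights,
# so the reduced estimate can be checked against the true gradient.
rng = np.random.default_rng(0)
weights = np.kron(rng.normal(size=(7, 7)), np.ones((4, 4)))  # shape (28, 28)

def score(img):
    return float(np.sum(weights * img))

x = rng.random((28, 28))
grad, queries = estimate_gradient(score, x, block=4)
full_queries = 2 * x.size  # cost of per-pixel central differences
```

For a 28 x 28 input with 4 x 4 blocks this uses 98 queries instead of the 1568 needed for per-pixel central differences. The estimate is exact here only because the toy score is linear with block-constant weights; for a real classifier it would be a coarse approximation of the gradient.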

Publication status

published

Contributors

Examiner : Alaifari, Rima

Publisher

ETH Zurich

Organisational unit

09603 - Alaifari, Rima (ehemalig) / Alaifari, Rima (former)
01411 - MSc Mathematik / MSc Mathematics
