Open access
Author
Date
2021
Type
Doctoral Thesis
ETH Bibliography
yes
Abstract
With the rise of modern information technology and the Internet, worldwide interconnectivity has led to the massive collection and evaluation of potentially sensitive data, often outside the control of those affected. The growing impact of this data stream and its potential for abuse raise concerns and call for protection against emerging exploitation and fear-driven self-censorship. The ability of individuals or groups to limit this flow and to express themselves selectively is commonly subsumed under the umbrella term "privacy". This thesis tackles the digital generation, processing, and control of personal information, so-called individual data privacy, from multiple angles. First, it introduces the concept of passive participation, which lets users access information over the Internet while hiding in cover traffic passively generated by regular visitors of frequently visited websites. This solves the bootstrapping problem for mid- and high-latency anonymous communication networks, where an adversary might collect thousands of traffic observations. Next, we analyze the statistical privacy leakage of many such sequential adversarial observations within the information-theoretic framework of differential privacy, which aims to limit and blur the impact of any single individual. We propose the privacy loss distribution, which unifies several commonly used differential privacy notions, and show that it converges to a Gaussian shape under independent sequential composition of observations. This allows differentially private mechanisms to be classified into privacy loss classes defined by the parameters of that Gaussian distribution. However, more blurring means less accurate results: the inherent privacy-utility trade-off. We apply a gradient descent optimizer to learn truncated noise patterns that minimize utility loss for differentially private mechanisms, which blur the impact of individuals by adding the learned noise to sensitivity-bounded outputs. Our results suggest that additive Gaussian noise is close to optimal, especially under sequential composition. Finally, we tackle the trust problem in the truthful execution of deletion requests for personal data and provide a framework for the probabilistic verification of such requests, demonstrating its feasibility for the case of machine learning.
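To make the idea of a Gaussian privacy loss class concrete, consider the following minimal sketch (ours, not the thesis code): for the Gaussian mechanism with a sensitivity-bounded query and noise scale sigma, the privacy loss distribution is itself Gaussian with mean mu = sensitivity^2 / (2 * sigma^2) and variance 2 * mu, and k-fold independent composition simply scales mu by k, yielding a closed-form delta(eps) curve.

    # Sketch only: tight delta(eps) for the Gaussian mechanism, whose privacy
    # loss distribution is N(mu, 2*mu) with mu = sensitivity**2 / (2*sigma**2);
    # k-fold composition replaces mu by k*mu (the privacy loss class moves).
    import math
    from scipy.stats import norm

    def gaussian_delta(eps, sigma, sensitivity=1.0, k=1):
        mu = k * sensitivity**2 / (2 * sigma**2)  # mean of the composed privacy loss
        s = math.sqrt(2 * mu)                     # its standard deviation
        return norm.cdf((mu - eps) / s) - math.exp(eps) * norm.cdf(-(mu + eps) / s)

    print(gaussian_delta(eps=1.0, sigma=2.0))         # a single observation
    print(gaussian_delta(eps=1.0, sigma=2.0, k=100))  # composition raises delta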
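The Gaussian convergence under sequential composition can also be observed numerically. A small sketch under assumed parameters p and k: for randomized response with flip probability p, each observation leaks a two-valued privacy loss, and the k-fold composed loss (a shifted binomial) rapidly approaches the Gaussian with matching mean and variance, as the central limit theorem predicts.

    # Sketch: the per-observation privacy loss of randomized response is
    # +/- log((1-p)/p); k-fold composition gives a shifted binomial that
    # approaches a Gaussian as k grows.
    import numpy as np
    from scipy.stats import binom, norm

    p, k = 0.25, 64
    c = np.log((1 - p) / p)              # per-observation loss magnitude
    flips = np.arange(k + 1)
    loss = (k - 2 * flips) * c           # support of the composed privacy loss
    pmf = binom.pmf(flips, k, p)         # its exact distribution
    mean = pmf @ loss
    var = pmf @ (loss - mean) ** 2
    gauss = norm.pdf(loss, mean, np.sqrt(var)) * 2 * c  # matched Gaussian, bin width 2c
    print(np.abs(pmf - gauss).max())     # pointwise gap shrinks as k grows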
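The learned-noise result can be sketched in the same spirit. A hypothetical, simplified instance (not the thesis optimizer; the support size m, learning rate, and penalty weight below are made-up illustration values): parametrize a discrete noise distribution on a truncated integer support with a softmax, compute the mechanism's delta(eps) for a sensitivity-1 shift in both neighboring-input directions, and run gradient descent on the expected squared noise plus a penalty for exceeding a target delta.

    # Hypothetical sketch: gradient-descent-learned truncated noise for an
    # additive integer mechanism with sensitivity 1 under an (eps, delta) goal.
    import math
    import torch

    eps, target_delta, m = 1.0, 1e-3, 30
    support = torch.arange(-m, m + 1, dtype=torch.float64)
    theta = torch.zeros(2 * m + 1, dtype=torch.float64, requires_grad=True)
    opt = torch.optim.Adam([theta], lr=0.05)

    for _ in range(3000):
        q = torch.softmax(theta, dim=0)             # noise distribution on [-m, m]
        right = torch.cat([q[1:], q.new_zeros(1)])  # q(t+1)
        left = torch.cat([q.new_zeros(1), q[:-1]])  # q(t-1)
        d = torch.maximum(                          # delta(eps), both shift directions
            torch.relu(q - math.exp(eps) * left).sum(),
            torch.relu(q - math.exp(eps) * right).sum(),
        )
        utility_loss = (q * support**2).sum()       # expected squared noise
        objective = utility_loss + 1e3 * torch.relu(d - target_delta)
        opt.zero_grad(); objective.backward(); opt.step()

    print(float(utility_loss), float(d))

Comparing the learned pattern against a discretized, truncated Gaussian of equal delta is the kind of experiment behind the abstract's near-optimality claim for Gaussian noise.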
Permanent link
https://doi.org/10.3929/ethz-b-000508911
Publication status
published
External links
Search print copy at ETH Library
Contributors
Examiner: Capkun, Srdjan
Examiner: Mittal, Prateek
Examiner: Mohammadi, Esfandiar
Examiner: Papernot, Nicolas
Examiner: Vechev, Martin
Publisher
ETH Zurich
Subject
Differential privacy; Machine learning; Anonymous communication
Organisational unit
03755 - Capkun, Srdjan / Capkun, Srdjan