Fighting Uphill Battles: Improvements in Personal Data Privacy
OPEN ACCESS
Date
2021
Publication Type
Doctoral Thesis
ETH Bibliography
yes
Abstract
With the rise of modern information technology and the Internet, worldwide interconnectivity has led to a massive collection and evaluation of potentially sensitive data, often beyond the control of those affected. The growing impact of this data stream and the potential for its abuse raise concern, calling for protection against emerging exploitation and fear-driven self-censorship. The ability of individuals or groups to limit this flow and to express themselves selectively is commonly subsumed under the umbrella term "privacy". This thesis tackles the digital generation, processing, and control of personal information, so-called individual data privacy, from multiple angles. First, it introduces the concept of passive participation, which enables users to access information over the Internet while hiding in cover traffic passively generated by regular users of frequently visited websites. This solves the bootstrapping problem for mid- and high-latency anonymous communication networks, where an adversary may collect thousands of traffic observations. Next, we analyze the statistical privacy leakage of multiple such sequential adversarial observations within the information-theoretic framework of differential privacy, which aims to limit and blur the impact of any individual. There, we propose the privacy loss distribution, which unifies several widely used differential privacy notions, and show that it converges to a Gaussian shape under independent sequential composition of observations; this allows differentially private mechanisms to be classified into privacy loss classes defined by the parameters of that Gaussian distribution. However, more blurring means less accurate results: the inherent privacy-utility trade-off. We apply a gradient descent optimizer to learn utility-loss-minimizing truncated noise patterns for differentially private mechanisms that blur the impact of individuals by adding the learned noise to sensitivity-bounded outputs. Our results suggest that Gaussian additive noise is close to optimal, especially under sequential composition. Finally, we tackle the trust problem in the truthful execution of deletion requests for personal data and provide a framework for probabilistic verification of such requests, demonstrating its feasibility for the case of machine learning.
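The abstract's claim that the privacy loss distribution (PLD) converges to a Gaussian shape under independent sequential composition can be illustrated numerically. The following minimal sketch is not the thesis's code: it discretizes the PLD of a sensitivity-1 Laplace mechanism on a grid and composes it by repeated convolution. The noise scale b, the grid resolution, and the number of compositions k are all illustrative assumptions.

```python
import numpy as np

b = 1.0                                   # Laplace noise scale (assumed)
xs = np.linspace(-40.0, 40.0, 400001)     # output grid
dx = xs[1] - xs[0]

# Output densities on two neighbouring inputs differing by sensitivity 1.
p0 = np.exp(-np.abs(xs) / b) / (2.0 * b)
p1 = np.exp(-np.abs(xs - 1.0) / b) / (2.0 * b)

# Privacy loss l(x) = ln(p0(x)/p1(x)); for the Laplace mechanism it lies
# in [-1/b, 1/b]. Clip to guard against floating-point spill-over.
loss = np.clip(np.log(p0 / p1), -1.0 / b, 1.0 / b)

# Discretized PLD: the distribution of the loss when outputs follow p0.
edges = np.linspace(-1.0 / b, 1.0 / b, 402)
step = edges[1] - edges[0]
centers = 0.5 * (edges[:-1] + edges[1:])
pld, _ = np.histogram(loss, bins=edges, weights=p0 * dx)

# k-fold independent sequential composition = k-fold self-convolution.
k = 10
composed = pld.copy()
for _ in range(k - 1):
    composed = np.convolve(composed, pld)
composed_centers = k * centers[0] + step * np.arange(composed.size)

# Moments of the composed PLD: skewness near zero indicates an
# approximately Gaussian shape, in line with the stated convergence.
m = composed.sum()
mean = (composed_centers * composed).sum() / m
var = ((composed_centers - mean) ** 2 * composed).sum() / m
skew = ((composed_centers - mean) ** 3 * composed).sum() / (m * var**1.5)
print(f"{k}-fold composed PLD: mean={mean:.3f}, var={var:.3f}, skew={skew:.3f}")
```

Since the composed loss is a sum of independent, identically distributed losses, its skewness shrinks roughly as 1/sqrt(k); increasing k in the sketch drives the composed PLD ever closer to a Gaussian shape, whose mean and variance then identify the privacy loss class described in the abstract.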
Publication status
published
Contributors
Examiner: Capkun, Srdjan
Examiner: Mittal, Prateek
Examiner: Mohammadi, Esfandiar
Examiner: Papernot, Nicolas
Examiner: Vechev, Martin
Publisher
ETH Zurich
Subject
Differential privacy; Machine learning; Anonymous communication
Organisational unit
03755 - Capkun, Srdjan / Capkun, Srdjan