Open access
Author
Date
2023-11
Type
- Master Thesis
ETH Bibliography
yes
Abstract
Widespread data collection and tracking on the internet drive the business models of numerous web services, as a large body of prior research has demonstrated. Despite the emergence of privacy regulations and technologies, their impact on data collection practices remains understudied. In a continuously evolving online landscape, one-time web measurements, unlike long-term studies, are limited in how meaningful their results can be.
Most tracking studies require collecting specific data that existing internet archiving initiatives omit. In contrast, the Privacy Observatory introduced in this thesis orchestrates regular, long-term crawls and measurements. We reimplement five influential privacy measurement studies and evaluate their reproducibility using theoretical criteria from prior work, which we show do not guarantee reproducibility in practice. Our approach relies on containerised Docker images and standardised input/output interfaces, which not only facilitates study replication but also reveals six fundamental principles crucial for ensuring the long-term replicability of such studies by future researchers.
By reimplementing these studies on the Privacy Observatory and executing them regularly in an immutable execution environment, we enable continuous observation of trends in privacy regulation compliance, offering insights into long-term developments in internet privacy practices.
Permanent link
https://doi.org/10.3929/ethz-b-000662341
Publication status
published
Publisher
ETH Zurich
Organisational unit
03634 - Basin, David / Basin, David