Open access
Date
2024-05
Type
- Conference Paper
ETH Bibliography
yes
Abstract
Investigating how websites use sensitive user data is an active research area. However, research based on automated measurements has been limited to those websites that do not require user authentication. To overcome this limitation, we developed a crawler that automates website registrations and newsletter subscriptions and detects both security and privacy threats at scale.
We demonstrate our crawler's capabilities by running it on 660k websites. We use this to identify security and privacy threats and to contextualize them within EU laws, namely the General Data Protection Regulation and ePrivacy Directive. Our methods detect private data collection over insecure HTTP connections and websites sending emails with user-provided passwords. We are also the first to apply machine learning to web forms, assessing violations of marketing consent collection requirements. Overall, we find that 37.2% of websites send marketing emails without proper user consent. This is mostly caused by websites failing both to verify and store consent adequately. Additionally, 1.8% of websites share users' email addresses with third parties without a transparent disclosure.
Permanent link
https://doi.org/10.3929/ethz-b-000674024
Publication status
published
Book title
WWW '24: Proceedings of the ACM on Web Conference 2024
Publisher
Association for Computing Machinery
Subject
Crawling; Registration; Consent; GDPR; ePrivacy; Compliance
Organisational unit
03634 - Basin, David / Basin, David