Studying and Protecting Internet Privacy: The Case of Internet Filtering and User Profiling
Whether through a deliberate desire for surveillance or as a consequence of design, a growing number of systems and applications record and process sensitive and personal information. Consequently, measuring, studying, and protecting Internet privacy is becoming increasingly crucial.
For instance, many governments worldwide monitor and censor Internet traffic but, due to the lack of publicly available information as well as the inherent risks of performing active measurements, it is often hard to analyze these practices in the wild. Therefore, the leak of 600GB worth of logs from 7 Syrian filtering proxies represents a unique opportunity to provide a detailed snapshot of a real-world censorship ecosystem.
In this talk, we present the methodology and the results of a measurement-based analysis of the logs, uncovering a relatively stealthy, yet quite targeted, censorship, and show that traffic in Syria was filtered in several ways: using IP addresses and domain names to block subnets or websites, and keywords or categories to target specific content.
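As a rough illustration of the filtering criteria named above, the sketch below checks a request against three kinds of rules: IP subnets, domain names, and keywords. Every rule, address, and function name here is hypothetical and chosen for illustration; none is taken from the leaked logs or from the proxies' actual configuration, and category-based filtering is left out because it would require an external categorization database.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical rules for illustration only (documentation-reserved ranges/names).
BLOCKED_SUBNETS = [ipaddress.ip_network("192.0.2.0/24")]
BLOCKED_DOMAINS = {"blocked.example"}
BLOCKED_KEYWORDS = {"forbidden"}


def filter_reason(url, dest_ip):
    """Return the first rule type matching the request, or None if it is allowed."""
    host = (urlsplit(url).hostname or "").lower()
    addr = ipaddress.ip_address(dest_ip)

    # IP-based rule: the destination falls inside a blocked subnet.
    if any(addr in net for net in BLOCKED_SUBNETS):
        return "ip"
    # Domain-based rule: the host is a blocked domain or one of its subdomains.
    if any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS):
        return "domain"
    # Keyword-based rule: a blocked string appears anywhere in the URL.
    if any(k in url.lower() for k in BLOCKED_KEYWORDS):
        return "keyword"
    return None
```

For example, `filter_reason("http://ok.example/search?q=forbidden", "198.51.100.1")` matches only the keyword rule, while the same host without the query string is allowed.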
Next, we focus on targeted advertising, a practice increasingly used by a growing number of online service providers. These providers collect large amounts of personal information, use it to build user profiles, and monetize those profiles with advertisers and data brokers. Users have little control over what information is processed and are often left with an "all-or-nothing" decision between receiving free services or refusing to be profiled.
This talk explores an alternative approach where users only disclose an aggregate model -- the "gist" -- of their data. We aim to preserve data utility and simultaneously provide user privacy, by letting users contribute encrypted and differentially-private data to an aggregator. The aggregator combines encrypted contributions and can only extract an aggregate model of the underlying data. We evaluate our framework on a dataset of 100,000 U.S. users obtained from the U.S. Census Bureau and show that (i) it provides accurate aggregates with as few as 100 users, (ii) it can generate revenue for both users and data brokers, and (iii) it incurs low overhead.
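To make the idea of aggregate-only disclosure concrete, here is a minimal sketch under strong simplifying assumptions. It is not the framework presented in the talk: additive masks that sum to zero stand in for the encryption scheme, the masks are assumed to come from some trusted setup, each user adds independent Laplace noise, and the modulus and all function names are hypothetical.

```python
import random

MODULUS = 2**32  # hypothetical modulus for additive masking


def zero_sum_masks(n, rng):
    """Generate n masks that sum to zero modulo MODULUS (assumed trusted setup)."""
    masks = [rng.randrange(MODULUS) for _ in range(n - 1)]
    masks.append(-sum(masks) % MODULUS)
    return masks


def add_laplace_noise(value, scale, rng):
    """Perturb a value with Laplace(0, scale) noise, rounded to an integer."""
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return round(value + noise)


def mask_value(noisy_value, mask):
    """Hide one user's noisy value; on its own the result looks uniformly random."""
    return (noisy_value + mask) % MODULUS


def aggregate(contributions):
    """Sum masked contributions; the masks cancel, leaving only the noisy total."""
    total = sum(contributions) % MODULUS
    # Map the result back to a signed integer, since noise can push it below zero.
    return total - MODULUS if total >= MODULUS // 2 else total


rng = random.Random(0)
values = [3, 1, 4, 1, 5]
masks = zero_sum_masks(len(values), rng)
noisy = [add_laplace_noise(v, scale=1.0, rng=rng) for v in values]
noisy_total = aggregate([mask_value(v, m) for v, m in zip(noisy, masks)])
```

In this sketch the aggregator sees only masked values and recovers `sum(noisy)`, never an individual contribution. The noise scale of 1.0 is arbitrary; a real deployment would calibrate it to the query sensitivity and the privacy budget.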
Emiliano De Cristofaro is a Senior Lecturer (~Associate Professor) at University College London (UCL). Prior to joining UCL in 2013, he was a Research Scientist at PARC (formerly known as Xerox PARC). In 2011, he received a PhD in Networked Systems (and Surfing) from the University of California, Irvine, advised by Gene Tsudik, and, in 2005, a B.Sc. in Computer Science (and Pizza) from the University of Salerno, Italy. His research interests include privacy, security, and applied cryptography. In 2011, he received the Dean's Dissertation Fellowship from UC Irvine and, in 2012, the Excellency Award from PARC's Computer Science Lab. In 2013-2014, he co-chaired the Privacy Enhancing Technologies Symposium (PETS) -- a fact that took two years away from his life expectancy. His homepage is available at http://emilianodc.com