Measuring the Privacy Risks and Value of Web Tracking / Analyser les risques sur la vie privée et l'économie du profilage WEB

New technologies introduce new problems and risks. For example, Internet users are constantly tracked and profiled online. This profiling allows sites to personalize, and thereby improve, the service they provide to each user. However, it also raises problems of privacy and personal data protection. It is well established that this personal data is often exchanged, or even sold, and that a genuine economy of personal data exists. This thesis studies how personal data, and in particular Web browsing histories (that is, the list of Web sites a user has visited), is collected, exchanged, and sold. It offers a privacy analysis of the auction systems used to sell targeted advertising. It shows how the various actors in online advertising collect and exchange personal data, and studies the risks for Internet users. It also offers an economic analysis and shows, in particular, that this data is sold off for a few thousandths of a dollar. / New media introduce new problems and risks. There are important security and privacy considerations related to online interactions. Users browsing the Web leave a constant trail of traces of their Web actions. A large number of entities take advantage of this data to continually improve how Web services function, often offering rich personalization capabilities -- and achieving this requires user data. To obtain it, Web users are tracked and profiled. Having user data may help enhance functionality and usability, but it can also introduce complex privacy problems related to data collection, storage, and processing. The incentives to gather user data are economic: user data is monetized.
We start with a description of privacy problems and risks, highlighting their roots in technological change; users must constantly struggle to adapt to these changes. The legal frameworks relating to privacy are about to change: Web companies will have to adapt to new realities. The first part of this thesis is devoted to measuring the consequences of private data leaks and tracking. We show how Web browsing history conveys insight into user interests. We study the risks of Web browsing history leaks. We point out that browsing history is to a large extent unique; we show this using a dataset of more than 350k partial history fingerprints. The consequence is that if browsing histories are personally identifiable information (PII), the upcoming European privacy legal frameworks could potentially result in strict guidelines for their collection, storage, and processing. Our measurement of tracking by third-party resources confirms the popular notion that most tracking is carried out by US-based companies. This creates interesting information asymmetries, which are of great importance, especially if user data can be directly equated with financial and economic benefits. The second part discusses the value of privacy. We study the emerging technology of Real-Time Bidding (RTB), online real-time auctions of ad spaces. We highlight that during the auction phase, bidders in RTB obtain user information such as the visited Web site or the user's location, and they pay for serving ads. In other words, user data flows are strictly related to financial flows. User data is thus monetized. We expose an interesting design characteristic of RTB which allows us to monitor a channel carrying winning bids -- the dynamically established fees bidders pay for displaying their ads. We perform a detailed measurement of RTB and study how this price for user information varies with factors such as time of day, user location, and type of visited Web site.
Using data obtained from real users, we also study the effect of user profiles. Users are indeed treated differently based on their previously visited Web sites (browsing history). We observed variability in the prices of RTB ads based on those traits. The price for user information in RTB is volatile and is typically in the range of $0.0001-$0.001. This study also had an important transparency component. We introduced a Web browser extension that lets users discover the prices that bidders in RTB pay. This demonstrates how user awareness could be improved. In part three, we continue the transparency trail. We point out that Web browsers allow every Web site (or the third-party resources it includes) to record the mouse movements of its visitors. We point out that recent advances in mouse movement analysis suggest that mouse movements can potentially be used to recognize and track users across the Web; mouse movement analysis can also be used to infer users' demographic data, such as age. We highlight the existence of mouse movement analytics -- third-party scripts specializing in collecting mouse movements. We also suggest that Web browser vendors should consider requiring permissions for access to the API that enables this kind of recording.
Date: 30 January 2015
Creators: Olejnik, Lukasz
Contributors: Grenoble Alpes, Castelluccia, Claude
Source Sets: Dépôt national des thèses électroniques françaises
Detected Language: English
Type: Electronic Thesis or Dissertation, Text
