XRay software reveals how Gmail, Amazon and YouTube leverage your personal data

XRay helps users see how popular online services track their personal information (Image: Alexander Supertramp / Shutterstock)

A new software tool developed at Columbia University is providing valuable insights into how some very popular websites make use of the sensitive data they collect from their users. The software could help sniff out potential abuses by advertisers and contribute to making the use of sensitive data a lot more transparent to the end user.

It is no secret that many of the most popular websites and online services actively track and store sensitive information from their users. The data they collect can include location, emails and search histories, which companies then attempt to monetize to the best of their ability – for instance, by producing better targeted ads, video suggestions and product recommendations.

Some of these services can be very useful, as they improve the user experience; however, it is very difficult to tell exactly how this sensitive data is being used, and that is a problem. As web services keep aggressively collecting more and more personal data to profile their users and maximize profits, it is important to make sure that this information is used in an ethical way, preventing abuses or morally questionable business practices (such as in the case of credit companies reportedly adjusting loan offers based on users' Facebook activity).

Columbia University researchers Roxana Geambasu, Augustin Chaintreau and Mathias Lecuyer have developed XRay, a software tool that aims to address this issue and bring more transparency to the web.

Their system works by tracking how the user's behavior influences "user targeting" including personalized advertisements, product recommendations and video suggestions. It then uses a probabilistic mathematical model to correlate the inputs and the outputs – the user behavior and the targeting from the website – to give users a good sense of how their personal data is being used. According to the researchers, their system has been able to predict user targeting with an accuracy of 80 to 90 percent.
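To give a rough feel for what "correlating inputs and outputs" can mean, here is a minimal Python sketch. It is not XRay's actual model, which the article describes only as probabilistic. It assumes a simplified setup in which several test accounts each hold a different subset of emails, and we record which ads each account is shown. The names `ACCOUNTS` and `targeting_scores`, and the sample data, are hypothetical.

```python
# Hypothetical observations: (topics of emails in the account, ads the account was shown).
ACCOUNTS = [
    ({"debt", "vacation"}, {"car_loan", "hotel"}),
    ({"debt", "recipes"}, {"car_loan"}),
    ({"vacation", "recipes"}, {"hotel"}),
    ({"recipes"}, set()),
]


def targeting_scores(accounts, ad):
    """Score each input by how much its presence raises the chance of seeing `ad`.

    The score is the share of accounts containing the input that saw the ad,
    minus the share of accounts without the input that saw it. This is a
    simple stand-in for a probabilistic model, not the researchers' method.
    """
    def rate(group):
        # Fraction of accounts in `group` that were shown the ad.
        return sum(ad in ads for _, ads in group) / len(group) if group else 0.0

    all_inputs = set().union(*(inputs for inputs, _ in accounts))
    scores = {}
    for item in all_inputs:
        with_item = [a for a in accounts if item in a[0]]
        without_item = [a for a in accounts if item not in a[0]]
        scores[item] = rate(with_item) - rate(without_item)
    return scores


if __name__ == "__main__":
    scores = targeting_scores(ACCOUNTS, "car_loan")
    for item, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{item}: {score:+.2f}")
```

In this toy data, the "car_loan" ad appears in every account that contains a "debt" email and in none that lack one, so "debt" gets the highest score. A real system would need many more accounts and a model that copes with noisy, rotating ads.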

In its current iteration, XRay can analyze data from Google Gmail, Amazon and YouTube. However, because of its highly flexible black box approach, Geambasu and colleagues say it could be easily adapted to new websites, even tracking data across multiple services.

The scientists created a set of emails with keywords, some of which included sensitive information, and then used XRay to examine what ads would appear to specifically target those messages.

The analysis that followed concluded that it is indeed possible for advertisers to target sensitive topics in users' inboxes, particularly with respect to health issues, including cancer, depression and pregnancy. The scientists also discovered actual examples of such abuses, such as advertisements that targeted the topic of debt in users' inboxes in order to advertise subprime loans for second-hand cars.

XRay is still in its early stages of development, but the researchers hope that releasing the software under an open-source license, as they have done, will help the development of a new generation of software tools that can ultimately help make the web a lot more transparent.

A live online demo of the software helps users better understand the ad targeting specifically for Gmail.

Source: Columbia University

5 comments
Mel Tisdale
From what I know about Google, and Facebook particularly, which I suspect also operates in the political sphere, I am not in the least surprised. Unfortunately, it is a reflection of modern governments that nationalising them would probably only make matters worse.
I don't think that George Orwell would be surprised by this article.
Flabba Wabbajabba
When I send my emails to my recipients, I do not give my permission for anyone or anything else to eavesdrop or read my message.
By law, I own the copyright to my own original works.
Someone needs to take these giants to court and stop them from this disgusting abuse. Imagine if google was in the post office opening and reading everyone's letters, and slipping advertising in before re-sealing them. Besides the word "post office", this is exactly what's going on.
The recipient DOES NOT HAVE PERMISSION to allow google or anyone else to read MY email. Only I do, and I don't grant that permission!
The thing that REALLY makes me mad though, is when google advertises my competitors to my own customers, directly in the emails I send to my customers. This is the lowest of the low. And when I asked them to stop, or to give me a way to specify which competitors not to show alongside the emails I send - they said "no". They're illegally stealing private info about my company from my emails in order to sabotage my business by selling advertising that competes against me directly to my customers, inside my own emails to them!
And guess what I can do about it? Nothing. Make google mad, and they'll drop you from their index, and your business life is over.
flink
I don't understand how, after all these years, anyone still thinks that google actually "reads" their email.
Google's system scans email for a list of keywords. No one is reading them.
Suman M Subramanian
Anyone else remember when Google's mantra was, "Don't be evil"?