American Scientist, March-April 2014


Uniquely Me!

How much information does it take to single out one person among billions?

Brian Hayes

History Sniffing

Illustration: On the Internet, they say, nobody knows you’re a dog. But everything else about you becomes marketing data for sale or trade.

Sharing information is what the Internet is all about, but most of us would like to retain some measure of control over the process. In particular, when you visit a website that doesn’t require you to log in with a user name and password, you might think you could remain anonymous. But some sites go to extraordinary lengths to assign you a uniquely identifying profile.

One notorious technique is called history sniffing. Web browsers keep a list of visited URLs for the convenience of the user; the list is not supposed to be available to the websites you visit. But ingenious programmers have found ways to probe the list’s content.

Browsers do offer a method to detect stylistic features of displayed information, such as the color of text. And visited links can be styled differently than unvisited ones. These facts set the scene for a privacy leak. An inquisitive website can include—hidden somewhere in the content it sends you—a list of links to various URLs. Also downloaded is a program (written in the JavaScript language) that checks the displayed color of each link. For every link that shows up as having been visited, the program sends a signal back to the web server.

This procedure does not answer the direct question, “What URLs are on your history list?” But it answers a series of yes-or-no questions of the form, “Have you recently visited site X?” Compiling a useful profile in this way might require asking about thousands of sites, which makes the technique grossly inefficient. But all the work of running the JavaScript program is done by your computer, not by the website’s server. And the web user whose browsing habits are being recorded is generally unaware of what’s going on; the long list of URLs is never actually displayed.
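The probing loop is easy to picture in miniature. What follows is only a toy simulation in Python, not the JavaScript a real site would send, and every name in it (ToyBrowser, displayed_link_color, sniff_history, the two colors, the example URLs) is invented for illustration. It assumes a browser that reports a link's true displayed color to any script that asks.

```python
# Toy simulation of a link-color history probe. Nothing here touches a real browser.
VISITED_COLOR = "purple"   # assumed style for visited links
UNVISITED_COLOR = "blue"   # assumed style for unvisited links


class ToyBrowser:
    """Stands in for a browser that styles visited links differently."""

    def __init__(self, history):
        self._history = set(history)  # the site is never handed this list

    def displayed_link_color(self, url):
        # The only thing the probing script can observe.
        return VISITED_COLOR if url in self._history else UNVISITED_COLOR


def sniff_history(browser, candidate_urls):
    """Ask one yes-or-no question per candidate URL; keep the 'yes' answers."""
    return [
        url
        for url in candidate_urls
        if browser.displayed_link_color(url) == VISITED_COLOR
    ]


browser = ToyBrowser(["https://example.com/cars", "https://example.org/news"])
candidates = [
    "https://example.com/cars",
    "https://example.net/shoes",
    "https://example.org/news",
]
leaked = sniff_history(browser, candidates)
print(leaked)
```

The script never reads the history list itself. It learns only about the sites it thought to ask about, which is why a thorough profile needs a long list of candidates.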

The operator of a website might be eager to peer into your history list for several reasons. For example, an online merchant might like to know if you have been shopping the competition. But even if the specific sites on the list are not of interest, the spectrum of yes-or-no responses can serve as an identifying fingerprint. What are the odds that your browsing history sets you apart from all others? Lukasz Olejnik of INRIA Grenoble and two colleagues collected history profiles from consenting volunteers. Out of 223,000 profiles in which they were able to detect at least four visited sites, 98 percent of the profiles were unique.
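To see how a string of yes-or-no answers can act as a fingerprint, consider a small sketch. The four profiles and the probe list below are made up and have nothing to do with the data Olejnik and his colleagues collected; the function names are hypothetical as well. The last line assumes a world population of roughly 7 billion, a round figure chosen only for illustration.

```python
import math
from collections import Counter


def fingerprint(visited, probe_list):
    """Encode the yes-or-no answer for each probed site as a tuple of 0/1 bits."""
    return tuple(1 if site in visited else 0 for site in probe_list)


def unique_fraction(profiles, probe_list):
    """Fraction of profiles whose fingerprint matches no other profile."""
    prints = [fingerprint(p, probe_list) for p in profiles]
    counts = Counter(prints)
    return sum(1 for fp in prints if counts[fp] == 1) / len(prints)


probe_list = ["a.example", "b.example", "c.example", "d.example"]
profiles = [
    {"a.example", "b.example"},
    {"a.example", "c.example"},
    {"a.example", "b.example"},  # same answers as the first profile
    {"d.example"},
]

share_unique = unique_fraction(profiles, probe_list)
distinct_patterns = 2 ** len(probe_list)             # patterns 4 questions allow
bits_needed = math.ceil(math.log2(7_000_000_000))    # assumed population

print(share_unique)       # 0.5
print(distinct_patterns)  # 16
print(bits_needed)        # 33
```

Each additional question doubles the number of possible answer patterns, so in principle a few dozen well-chosen questions could give every person on Earth a different pattern. Real browsing habits are far from evenly spread across those patterns, which is why the measured uniqueness of actual profiles is the more telling number.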

History sniffing has some defensible uses, but the potential for abuse was recognized early on, and recent versions of major browsers attempt to block history probes. Visited links are still rendered distinctively on the screen, but if a JavaScript program asks about that formatting, the browser lies, reporting that all links are unvisited.
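The countermeasure amounts to keeping two answers to the same question: one for the screen and one for scripts. A minimal Python sketch of that idea follows; the function names and colors are invented, and actual browsers implement the defense in their own ways.

```python
VISITED_COLOR = "purple"   # assumed style for visited links
UNVISITED_COLOR = "blue"   # assumed style for unvisited links


def rendered_color(url, history):
    """What the user sees on screen: visited links still look different."""
    return VISITED_COLOR if url in history else UNVISITED_COLOR


def reported_color(url, history, blocks_probes=True):
    """What a script is told when it asks about a link's color."""
    if blocks_probes:
        return UNVISITED_COLOR  # the browser claims every link is unvisited
    return rendered_color(url, history)


history = {"https://example.com/cars"}
seen = rendered_color("https://example.com/cars", history)
told = reported_color("https://example.com/cars", history)
print(seen, told)  # purple blue
```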

In spite of these countermeasures, history sniffing has not disappeared. Last year a company called Dataium was accused of using history sniffing (among other techniques) to track the activities of automobile shoppers across 10,000 websites; in a negotiated settlement, Dataium agreed to abandon the practice. An earlier case against the advertising network Epic Marketplace reached a similar conclusion.

Meanwhile, other devious history-sniffing methods have come along. Instead of examining the format of a link, a program can measure the time needed to load an image from a site; a quick response to the request probably indicates that the image was already present in your browser’s memory cache following a recent visit.
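The timing approach can be sketched the same way. The Python below is a simulation with made-up load times and a made-up threshold; a real probe would time an actual image request inside the browser and would have to cope with noisy measurements, which is why its conclusion is only ever a probable one.

```python
CACHED_MS = 5.0       # assumed load time for an image already in the cache
NETWORK_MS = 120.0    # assumed load time for an image fetched over the network
THRESHOLD_MS = 30.0   # assumed cutoff between "quick" and "slow"


def simulated_load_time(image_url, cache):
    """Return a pretend load time in milliseconds; no request is made."""
    return CACHED_MS if image_url in cache else NETWORK_MS


def probably_visited(image_url, cache, threshold_ms=THRESHOLD_MS):
    """Guess 'visited' when the image loads faster than the threshold."""
    return simulated_load_time(image_url, cache) < threshold_ms


cache = {"https://example.com/logo.png"}  # left behind by a recent visit
probes = ["https://example.com/logo.png", "https://example.net/banner.png"]
guesses = {url: probably_visited(url, cache) for url in probes}
print(guesses)
```

Unlike the color probe, this version depends on nothing the browser reports about link styling, so the countermeasure of lying about link colors does not affect it.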


