Researchers at U.C. Berkeley have discovered that some of the net’s most popular sites are using a tracking service that can’t be evaded — even when users block cookies, turn off storage in Flash, or use browsers’ “incognito” functions.
The service, called KISSmetrics, is used by sites to track the number of visitors, what the visitors do on the site, and where they come to the site from — and the company says it does a more comprehensive job than its competitors such as Google Analytics.
But the researchers say the company is using sneaky techniques to prevent users from opting out of being tracked on popular sites, including the TV streaming site Hulu.com.
The discovery of KISSmetrics’ tracking techniques comes as federal regulators, browser makers, privacy activists and ad tracking companies are trying to define what tracking actually is. The FTC called on browser makers to add a “Do Not Track” setting that essentially lets users tell websites to leave them alone, though it doesn’t block tracking on its own. It’s more like a “privacy, please” sign on a hotel door. One of the big questions surrounding Do Not Track is about web analytics software, which sites use to determine what’s popular on their site, how many unique visitors a site has a month, where users are coming from, and what pages they leave from.
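To make the “privacy, please” comparison concrete: browsers that support Do Not Track express the preference as an HTTP request header, `DNT: 1`, and it is up to each site to honor it. The sketch below is illustrative only. The function name `should_track` and the plain-dict representation of headers are assumptions made for this example, not code from any site or service named in this story.

```python
def should_track(request_headers: dict) -> bool:
    """Return False when the visitor has sent the Do Not Track signal.

    Hypothetical helper: the header is only a request, so nothing
    happens unless the site chooses to run a check like this one.
    """
    # HTTP header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value.strip() for name, value in request_headers.items()}
    return normalized.get("dnt") != "1"


# A visitor with the setting enabled, and one without it.
opted_out = should_track({"DNT": "1"})
default_visitor = should_track({"User-Agent": "ExampleBrowser"})
```

A site that never performs such a check simply keeps tracking, which is why the setting does not block anything on its own.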
In response to inquiries from Wired.com, Hulu cut ties with KISSmetrics on Friday.
UPDATE 5:00 PM Friday: Spotify, another KISSmetrics customer named in the report, said that it was concerned by the story:
“We take the privacy of our users incredibly seriously and are concerned by this report,” a spokeswoman said by e-mail. “As a result, we have taken immediate action in suspending our use of KISSmetrics whilst the situation is investigated.” /UPDATE
“Hulu has suspended our use of KISSmetrics’ services pending further investigation,” a spokeswoman told Wired.com. “Hulu takes our users’ privacy very seriously. We have no further comment at this time.”
KISSmetrics is a 17-person start-up founded in 2008 and based in the San Francisco Bay Area. Founder Hiten Shah confirmed that the research was correct, but told Wired.com Friday morning that there was nothing illegal about the techniques it was using.
“We don’t do it for malicious reasons. We don’t do it for tracking people across the web,” Shah said. “I would be having lawyers talk to you if we were doing anything malicious.”
Shah says KISSmetrics is used by thousands of sites to track incoming users and that it does not sell or buy data about those visitors. After this story was published, the company tweeted a link that explains how its tracking works.
So if a user came to Hulu.com from an ad on Facebook, and then later, using a different browser on the same computer, visited Hulu.com from Google, and then at some point signed up for the premium service, KISSmetrics would be able to tell Hulu all about that user’s path to purchase (without knowing who that person was). That tracking trail would remain in place even if a user deleted her cookies, due to code that stores the unique ID in places other than in a traditional cookie.
The research was published Friday by a team of UC Berkeley privacy researchers that includes veteran privacy lawyer Chris Hoofnagle and noted privacy researcher Ashkan Soltani.
“The stuff works even if you have all cookies blocked and private-browsing mode enabled,” Soltani said. “The code itself is pretty damning.”
The researchers were reprising a study from 2009 that discovered that some of the net’s biggest sites were using technology from online ad tracking firms Clearspring and Quantcast to re-create users’ cookies after users deleted them. The technique involved using a little-known property of Flash to hold onto unique ID numbers. Then, if a user deleted her cookies, the companies would check in the secondary stash for the user ID, and use it to resurrect the traditional HTML cookies.
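The respawning logic described in the 2009 study can be sketched in a few lines. This is a simplified model, not the actual Clearspring, Quantcast or KISSmetrics code: the two dictionaries stand in for the browser’s cookie jar and the secondary Flash storage, and the function name `load_or_respawn_id` is invented for the example.

```python
import uuid


def load_or_respawn_id(cookies: dict, backup_store: dict, key: str = "uid") -> str:
    """Return the visitor's ID, restoring a deleted cookie from the backup copy."""
    # Prefer the ordinary cookie; fall back to the secondary stash if it is gone.
    visitor_id = cookies.get(key) or backup_store.get(key)
    if visitor_id is None:
        # Neither store has an ID, so this looks like a first visit.
        visitor_id = uuid.uuid4().hex
    # Write the ID to both places so either one can resurrect the other.
    cookies[key] = visitor_id
    backup_store[key] = visitor_id
    return visitor_id


cookie_jar, flash_storage = {}, {}
first_visit_id = load_or_respawn_id(cookie_jar, flash_storage)
cookie_jar.clear()  # the user deletes her cookies
second_visit_id = load_or_respawn_id(cookie_jar, flash_storage)
```

In this model the second visit receives the same ID as the first, because clearing only the cookie jar leaves the backup copy untouched.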
That finding led to inquiries from regulators and a class action lawsuit alleging that websites and the tracking companies were unfairly monitoring users. That suit was settled for $2.4 million in cash and a promise by Clearspring and Quantcast not to use that method again.
One of the sites named in that suit was Hulu, but its part of the settlement only required that the company tell users if it was using Flash to store cookies and provide a link in the policy that would show users how to turn off Flash data storage. However, with KISSmetrics running, even knowing how to do that wouldn’t have saved a user from persistent tracking.
This go-round the researchers’ report found only two sites that were recreating cookies after users deleted them — and Hulu.com was the only one doing so for tracking users across the entire site.
The researchers dug into Hulu.com’s tracking code and discovered the KISSmetrics code. Using it, Hulu was able to track users regardless of which browser they used or whether they deleted their cookies. KISSmetrics used a number of methods to recreate cookies, and the persistent tracking can only be avoided by erasing the browser cache between visits.
They also say that Shah’s defense that the system is not used to track people around the web doesn’t hold up.
“Both the Hulu and KISSmetrics code is pretty enlightening,” Soltani told Wired.com in an e-mail. “These services are using practically every known method to circumvent user attempts to protect their privacy (Cookies, Flash Cookies, HTML5, CSS, Cache Cookies/Etags…) creating a perpetual game of privacy ‘whack-a-mole’.”
“This is yet another example of the continued arms-race that consumers are engaged in when trying to protect their privacy online since advertisers are incentivized to come up with more pervasive tracking mechanisms unless there’s policy restrictions to prevent it.”
They point to their research that found that when a user visited Hulu.com, they would get a “third-party” cookie set by KISSmetrics with a tracking ID number. KISSmetrics would pass that number to Hulu, allowing Hulu to use it for its own cookie. Then if a user visited another site that was using KISSmetrics, that site’s cookie would get the exact same number as well.
So that makes it possible, the researchers say, for any two sites using KISSmetrics to compare their databases, and ask things like “Hey, what do you know about user 345627?” and the other site could say “his name is John Smith and his email address is [email address] and he likes these kinds of things.”
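The researchers’ concern comes down to a simple database join on the shared ID. The sketch below uses made-up records for two hypothetical sites; only the ID 345627 and the name John Smith come from the example above, and nothing here suggests that any site has actually combined its data this way.

```python
def merge_profiles(site_a: dict, site_b: dict) -> dict:
    """Combine what two sites know about every visitor ID they both hold."""
    shared_ids = site_a.keys() & site_b.keys()
    # For each shared ID, fold the two records into one combined profile.
    return {uid: {**site_a[uid], **site_b[uid]} for uid in shared_ids}


# Hypothetical records, keyed by the tracking ID both sites received.
video_site = {
    "345627": {"watched": ["sitcoms", "cooking shows"]},
    "900001": {"watched": ["news"]},
}
music_site = {
    "345627": {"name": "John Smith"},
}

combined = merge_profiles(video_site, music_site)
```

Because both sites store the same number for the same browser, the join needs no name or email address to line the records up.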
Shah did not respond to a follow-up e-mail seeking clarification on his first answers.
KISSmetrics is used by a number of prominent websites, which Wired.com is not naming until we have time to contact them.
Berkeley researcher Soltani, who consulted for the Wall Street Journal’s reporting on privacy, notes that the code includes function names like “cram cookie.”
One of the techniques used involves using something called ETags in the browser cache, a once-theoretical technique that’s never before been seen in the wild on a major site, according to the researchers.
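For readers unfamiliar with ETags: a server can attach an `ETag` header to a cached file, and the browser sends that value back in an `If-None-Match` header the next time it asks for the file, so the server can reply `304 Not Modified` instead of resending it. If the server hands each browser a unique ETag, the value works like a cookie that lives in the cache. The sketch below models that exchange with plain dictionaries; the function name `handle_request` is an assumption for illustration, and this is not the KISSmetrics code.

```python
import uuid
from typing import Dict, Tuple


def handle_request(request_headers: Dict[str, str]) -> Tuple[int, Dict[str, str], str]:
    """Return (status, response headers, visitor ID) for one request."""
    returned_etag = request_headers.get("If-None-Match")
    if returned_etag:
        # The browser echoed the ETag from its cache: read the ID back out
        # and answer 304 so the cached copy (and its ETag) stays in place.
        visitor_id = returned_etag.strip('"')
        return 304, {"ETag": returned_etag}, visitor_id
    # No cached copy yet: mint a new ID and hide it in the ETag.
    visitor_id = uuid.uuid4().hex
    response_headers = {
        "ETag": f'"{visitor_id}"',
        # no-cache asks the browser to revalidate before reusing the copy,
        # which makes it send If-None-Match on the next visit.
        "Cache-Control": "no-cache",
    }
    return 200, response_headers, visitor_id


# First visit, then a return visit with cookies cleared but the cache intact.
status_1, headers_1, id_1 = handle_request({})
status_2, headers_2, id_2 = handle_request({"If-None-Match": headers_1["ETag"]})
```

No cookie appears anywhere in this exchange, which is consistent with the researchers’ finding that only erasing the browser cache between visits stops this kind of tracking.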
The research also found that many top websites have adopted new ways to track users using HTML5 and that Google tracking cookies are present on 97 of the top 100 sites, including government sites such as IRS.gov.
A screenshot of a browser cache cookie, which researchers say has never been seen in the wild before.
Further resources:
- The actual Flash/HTML5/Cache/Etags respawning code used by KISSmetrics on Hulu: code, pastebin
- Hulu’s own code to respawn cookies: code, see it on ShowMyCode by entering https://www.hulu.com/guid.swf?v2
- The full report from the Berkeley researchers
- An image from Ashkan Soltani showing tracking ID being set in a browser even with cookies blocked and in ‘private’ browsing mode.