When asked about online privacy, most people say they want more information about how they are being tracked and more control over how their personal information is used. Those consumer expectations are rarely in line with the data collection practices of Internet companies, which often collect information about their users not only on their own sites, but also when those users visit other sites across the Web.
Those are some of the central findings of a new privacy study conducted by a group of graduate students at the University of California, Berkeley, which was released late Monday. The students at the School of Information – Joshua Gomez, Travis Pinnick and Ashkan Soltani – studied consumer expectations by looking at sources like complaints filed with the Federal Trade Commission and data collected by the state of California and a privacy group. They analyzed company practices using Ghostery, a browser plug-in that detects cookies, Web beacons and other types of trackers that allow third parties to gather information about Web site visitors, often without their knowledge.
Google showed up as the most conspicuous tracker on third-party sites. Google Analytics, a free product that allows online publishers to gather statistics about visitors to their sites, was used on 81 of the top 100 sites. Cookies from the advertising company DoubleClick, which is owned by Google, were present on 70 of those sites. When combining trackers from those two services, Google had a presence on 92 of the top 100 sites. Others weren’t far behind. Cookies from Atlas, Microsoft’s DoubleClick rival, appeared on 60 sites, and trackers from two other analytics companies, Quantcast and Omniture, showed up on 54 sites.
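The combined figure implies a large overlap between the two Google services. A minimal sketch of that arithmetic, taking the three reported counts at face value and using variable names that are purely illustrative, applies the inclusion-exclusion rule:

```python
# Counts reported for the top 100 sites, as described in the article.
analytics_sites = 81    # sites using Google Analytics
doubleclick_sites = 70  # sites with DoubleClick cookies
either_sites = 92       # sites with at least one of the two

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
both_sites = analytics_sites + doubleclick_sites - either_sites

# Sites carrying exactly one of the two trackers.
only_analytics = analytics_sites - both_sites
only_doubleclick = doubleclick_sites - both_sites

print(both_sites, only_analytics, only_doubleclick)
```

Under those assumptions, a majority of the top 100 sites would carry both trackers, which is why the combined total is well below the sum of the two individual counts.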
The findings roughly line up with those in other studies of third-party tracking on the Web. Researchers from AT&T Labs and Worcester Polytechnic Institute, for instance, looked at a much larger sample of 1,200 popular Web sites and found Google trackers on 61 percent of them. Omniture’s tracker was on 34 percent and Microsoft’s on 24 percent.
What is striking in the Berkeley students’ report is that in a sample of nearly 400,000 Web domains, Google’s presence remained high, at 88 percent, while that of other companies declined sharply. The second most frequent tracker in that sample was from an analytics company called StatCounter, which appeared on only 7 percent of domains. Assuming the data is accurate, it is a testament to the widespread popularity of Google services like Analytics, DoubleClick and AdSense, the company’s contextual advertising network, which is used by a large percentage of Web sites, small and large.
“I don’t know that anyone has identified the scope and depth of the coverage that Google has across the Web in terms of tracking,” Mr. Soltani said. “Our data shows that even if you are not going to Google, if you are browsing the Web they are collecting data about you.”
The implications of the study, however, are not exactly clear. “We are not claiming that Google aggregates information from each of these trackers into a central database, though it does possess the capability to do so,” the researchers wrote.
But Google disputes even that. For instance, it said that the cookies used by its analytics service are different on each Web site, so they do not allow the company to track a user from site to site. “It doesn’t enable any cross-site tracking,” said Mike Yang, managing counsel at Google. Mr. Yang also said Google’s contracts with customers do not allow it to merge data from various services like DoubleClick and AdSense, or to link that data to personal information that Google collects when users sign up for its other services.
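The distinction Google draws turns on which domain a cookie belongs to. The sketch below is a deliberately simplified model, not Google’s implementation: the `Browser` class, the `visit` function and the `.example` domain names are all hypothetical. It assumes only that a browser keeps one identifier per cookie domain, and compares a cookie set under each visited site’s own domain with one set under a single tracker domain.

```python
import uuid


class Browser:
    """Toy cookie jar: one identifier per cookie domain (an assumption)."""

    def __init__(self):
        self.cookies = {}

    def cookie_for(self, cookie_domain):
        # Create an ID on first contact with a domain, then reuse it.
        if cookie_domain not in self.cookies:
            self.cookies[cookie_domain] = uuid.uuid4().hex
        return self.cookies[cookie_domain]


def visit(browser, site, tracker_domain=None):
    """Return the identifier the tracker sees on this visit.

    tracker_domain=None models a cookie set under the visited site's own
    domain; otherwise the cookie belongs to the tracker's domain.
    """
    return browser.cookie_for(tracker_domain or site)


browser = Browser()
sites = ["news.example", "shop.example"]  # hypothetical sites

per_site_ids = [visit(browser, s) for s in sites]
shared_ids = [visit(browser, s, "ads.example") for s in sites]

print(len(set(per_site_ids)))  # number of distinct IDs seen across sites
print(len(set(shared_ids)))
```

In this model the per-site identifiers differ from one site to the next, so they cannot by themselves link two visits to the same browser, while the identifier tied to the tracker’s own domain is the same on both sites.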
What’s more, the data from the Berkeley study, which reports the presence of trackers by domain, can overstate the amount of tracking that is taking place. Many large domains like MySpace can include multiple sites with thousands of pages, if not tens or hundreds of thousands of pages. The presence of a tracker on one site or page doesn’t mean users are tracked across the entire domain.
Still, the numbers are eye-catching. Just as important as the numbers themselves is what the study says about the disconnect among how Americans conceive of privacy, how companies actually handle data and how the government regulates those practices, said Chris Hoofnagle, director of the Berkeley Center for Law and Technology’s information privacy programs, who helped advise the students.
“Consumers were complaining to the F.T.C. about a lack of control over personal information,” Mr. Hoofnagle said. “That is very different from how the F.T.C. has framed the issue,” he said, noting that under the Bush administration, the agency frowned on privacy practices only if they caused harm to consumers.
Mr. Hoofnagle added: “We have a new F.T.C. now. They may scrap the ‘harm’ approach and look at some other method for balancing rights and responsibilities.”