Sunday, August 28, 2005

Google anything, so long as it's not Google

Google Anything, so Long as It's Not Google

If you were Google's C.E.O., wouldn't you Google yourself? At least once? Would you be surprised to discover that your recent stock sales, net worth, hobbies and contributions to various political candidates are online and easily reached with a click or two?

That your home address pops up so readily - O.K., that may have come as a surprise - shows that a person can no longer designate which piece of personal information becomes public and which remains private.

So why, if you're Eric E. Schmidt, the chairman and chief executive of Google, a soft-spoken person without a history of intemperate action, do you furiously strike at the poor messenger who delivers the news that your company's search service works very well indeed?

Last month, Elinor Mills, a writer for CNET News, a technology news Web site, set out to explore the power of search engines to penetrate the personal realm: she gave herself 30 minutes to see how much she could unearth about Mr. Schmidt by using his company's own service. The resulting article, published online at CNET's News.com under the sedate headline "Google Balances Privacy, Reach," was anything but sensationalist. It mentioned the types of information about Mr. Schmidt that she found, providing some examples and links, and then moved on to a discussion of the larger issues. She even credited Google with sensitivity to privacy concerns.

When Ms. Mills's article appeared, however, the company reacted in a way better suited to a 16th-century monarchy than a 21st-century democracy with an independent press. David Krane, Google's director of public relations, called CNET.com's editor in chief to complain about the disclosure of Mr. Schmidt's private information, and then Mr. Krane called back to announce that the company would not speak to any reporter from CNET for a year.

CNET's transgression is unspeakable - literally so. When I contacted Mr. Krane last week, he said he was not authorized to speak about the incident.

Mr. Schmidt and his staff have had six weeks to restore a working relationship with CNET (and to apologize). They have not done so, leaving intact the impression that CNET committed lèse-majesté. So, too, did Fortune magazine in 1997, when it published a profile of Louis V. Gerstner, then the I.B.M. chairman. I.B.M. cut off contact with the offending magazine and pulled all advertising for good measure. The company did not explain the action, leaving readers to wonder whether Mr. Gerstner had been piqued by the magazine's description of his get-outta-my-way manner on the golf course.

More recently, Apple Computer earlier this year pulled from the shelves of Apple's retail stores all titles published by John Wiley & Sons, the publisher of a biography of Steve Jobs that displeased His Highness.

Mr. Schmidt's is a special case, however. He or his proxy apparently was angered by a journalist who did nothing more than use Mr. Schmidt's own service to gather publicly available material for a policy discussion. Mr. Schmidt's home address comes from a Federal Election Commission database, which lists this and other details about donors who contribute more than $200 in a year to a candidate. If CNET's mention of the readily available information discomfited Mr. Schmidt, it should not have. Two months previously, when Google was host of a briefing for members of the news media, it was Mr. Schmidt who had explained his company's ambitions so boldly: "When we talk about organizing all of the world's information, we mean all."

Providing access to all information increasingly puts Google in the same defensive position as CNET, repeating the same refrain: This stuff is already out there. Two Dutch politicians created a stir this month when they formally asked the Dutch government to investigate the possibility that Google Earth, which provides aerial views of most everywhere, including the Hague and Schiphol Airport in Amsterdam, could be used by terrorists. But those images, Google countered, are already available from commercial sources. Google said last week that it had "proactively" reached out to the United States Defense Department to see if it had security concerns, adding that the department had not registered any to date.

More access to information, thanks to improved search-engine indexing, is better than less. But increased vulnerability comes with the package, as the Dutch and Mr. Schmidt have found.

Book publishers are feeling insecure in their own houses, too, as whirring search-engine bots hover above them. The information in question is substantively different, however, than that held in government databases in the public domain, freely available to the citizenry. The contents of books are protected by copyright law, and are most emphatically not freely available.

Google acknowledged the special protections extended to books under copyright when it began negotiating with publishers last year for permission to scan and then index the contents of copyrighted books. Publishers were receptive, contracts were signed, progress was made. But Google subsequently found a speedier way to proceed: simply borrowing the bound print collections of the Harvard, Stanford and University of Michigan research libraries for its scanning. When the project is completed, Google will retain perfect digital copies, as well as provide copies to the libraries. No muss, no fuss, no negotiations with copyright holders.

Authors and publishers were astounded. Peter Givler, the executive director of the Association of American University Presses, said Google's action was "tantamount to saying that Google can make copies of every copyrighted work ever published, period." The courts, he said, have never recognized a claim such as Google's that "fair use," which permits limited copying for research purposes, would permit the copying of an entire book.

The cries of protest from publishers have not abated with the passage of time. This month, Google announced that it would go ahead and copy all books unless the publisher elects by November to opt out, title by title. Allan Adler, the vice president for legal and government affairs of the Association of American Publishers, asked Google to imagine what its own reaction would be were others to help themselves to Google's intellectual property covered by patents, with the burden placed on Google to find out about the use and opt out. He described Google's recent actions as "a very aggressive, pushy style that says, 'We don't care that your business is different than ours.' "

The Association of American University Presses is less concerned about the bits from books that would appear in Google's search results than about digital copies of each work, infinitely reproducible, whose use and safekeeping would not be governed by an agreement with the copyright holder. Michigan's contract with Google reserves for the university the right in the future to share its digital copies with "partner research libraries."

Mr. Givler is especially concerned about this clause, as university presses rely heavily upon sales to university libraries. Without copyright protection, it is not far-fetched to imagine a day when one copy of a Google-scanned digital book will suffice for an entire network of "partner research libraries," swapping rights without payment to the publisher. When asked what a press will do when it is able to sell only a single bound copy of a scholarly work, Mr. Givler laughed mirthlessly and said, "Charge $40,000 for the one copy."

One of the personal items revealed when CNET Googled Mr. Schmidt was a speaker's biography that he had apparently provided the Computer History Museum for a talk he gave four years ago. He described himself then as a "political junkie who never tires of debating the great issues of our day." Very well, Mr. Schmidt. When CNET next calls, please pick up the phone and let this debate begin.

Randall Stross is a historian and author based in Silicon Valley. E-mail: ddomain@nytimes.com.

Google balances privacy, reach

By Elinor Mills
http://news.com.com/Google+balances+privacy%2C+reach/2100-1032_3-5787483.html

Story last modified Thu Jul 14 04:00:00 PDT 2005

Google CEO Eric Schmidt doesn't reveal much about himself on his home page.

But spending 30 minutes on the Google search engine lets one discover that Schmidt, 50, was worth an estimated $1.5 billion last year. Earlier this year, he pulled in almost $90 million from sales of Google stock and made at least another $50 million selling shares in the past two months as the stock leaped to more than $300 a share.

He and his wife Wendy live in the affluent town of Atherton, Calif., where, at a $10,000-a-plate political fund-raiser five years ago, presidential candidate Al Gore and his wife Tipper danced as Elton John belted out "Bennie and the Jets."

Schmidt has also roamed the desert at the Burning Man art festival in Nevada, and is an avid amateur pilot.

That such detailed personal information is so readily available on public Web sites makes most people uncomfortable. But it's nothing compared with the information Google collects and doesn't make public.

What Google knows about you

• Gmail -- The e-mail service offers two gigabytes of free storage and scans the content of messages to serve up context-related ads.

• Cookies -- Google uses cookies, which are commonly used to link individual users with activities.

• Desktop Search -- Google's Desktop Search lets users easily search files stored on their computer.

• Web Accelerator -- The application speeds Web surfing by storing cached copies of Web pages you've visited; those page requests can include personal information.

Assuming Schmidt uses his company's services, someone with access to Google's databases could find out what he writes in his e-mails and to whom he sends them, where he shops online or even what restaurants he's located via online maps. Like so many other Google users, his virtual life has been meticulously recorded.

The fear, of course, is that hackers, zealous government investigators, or even a Google insider who falls short of the company's ethics standards could abuse that information. Google, some worry, is amassing a tempting record of personal information, and the onus is on the Mountain View, Calif., company to keep that information under wraps.

Privacy advocates say information collected at Yahoo, Microsoft's MSN, Amazon.com's A-9 and other search and e-commerce companies poses similar risks. Indeed, many of those companies' business plans tend to mimic what Google is trying to do, and some are less careful with the data they collect. But Google, which has more than a 50 percent share of the U.S. search engine market, according to the latest data from WebSideStory, has become a lightning rod for privacy concerns because of its high profile and its unmatched impact on the Internet community.

"Google is poised to trump Microsoft in its potential to invade privacy, and it's very hard for many consumers to get it because the Google brand name has so much trust," said Chris Hoofnagle of the Electronic Privacy Information Center. "But if you step back and look at the suite of products and how they are used, you realize Google can have a lot of personal information about individuals' Internet habits--e-mail, saving search history, images, personal information from (social network site) Orkut--it represents a significant threat to privacy."

Kevin Bankston, staff attorney at the Electronic Frontier Foundation, said Google is amassing data that could create some of the most detailed individual profiles ever devised.

"Your search history shows your associations, beliefs, perhaps your medical problems. The things you Google for define you," Bankston said.

The Google record
As is typical for search engines, Google retains log files that record search terms used, Web sites visited and the Internet Protocol address and browser type of the computer for every single search conducted through its Web site.

In addition, search engines are collecting personally identifiable information in order to offer certain services. For instance, Gmail asks for name and e-mail address. By comparison, Yahoo's registration also asks for address, phone number, birth date, gender and occupation and may ask for home address and Social Security number for financial services.

If search history, e-mail and registration information were combined, a company could see intimate details about a person's health, sex life, religion, financial status and buying preferences.

It's "data that's practically a printout of what's going on in your brain: What you are thinking of buying, who you talk to, what you talk about," Bankston said. "It is an unprecedented amount of personal information, and these third parties (such as Google) have carte blanche control over that information."

Google uses the log information to analyze traffic in order to prevent people from rigging search results, for blocking denial-of-service attacks and to improve search services, said Nicole Wong, associate general counsel at Google.

Personally identifiable information that is required for consumers to register for and log in to Google services is not shared with any outside companies or used for marketing, according to Google's privacy policy, except with the consent of the user, or if outside "trusted" parties need it to process the data on Google's behalf.

Correction: The original article incorrectly implied that Google Desktop Search can track what's stored on a user's PC. The service does not expose a user's content to Google or anyone else without the user's explicit permission.

Concern about Google's data retention practices has become more acute since the company went public last August. The company's motto of doing no evil remains, but some people question Google's ability to adequately balance the heavy burden of safeguarding consumer privacy rights with the pull toward intermingling and mining data for ever more lucrative targeted advertising.

"Although Google is held in high esteem by the public as a good corporate citizen, past performance is no guarantee of future behavior, especially following Google's IPO when the company will have a legal duty to maximize shareholder wealth," Hoofnagle said in testimony in March before the California Senate Judiciary Committee on the privacy risks of e-mail scanning.

Google can't make promises about what it will or won't do with the data in the future or state explicitly how it uses the information, but executives there do believe their privacy policy provides adequate assurances to calm consumers' fears.

"It's very hard for many consumers to get it, because the Google brand name has so much trust."
--Chris Hoofnagle, director, Electronic Privacy Information Center

Google's privacy policy says it may share information submitted under a Google account service "among all of our services in order to provide you with a seamless experience and to improve the quality of our services." Google representatives wouldn't elaborate on what that means.

Yahoo's privacy policy, by comparison, says it "may combine information about you that we have with information we obtain from business partners or other companies" and that it uses the data to customize the advertising and content that users see, contact users, conduct research and improve services.

Google, like virtually all companies, also complies with legal orders such as search warrants and subpoenas.

"The prospect of unlimited data retention creates a honey pot for law enforcement," Hoofnagle said in his testimony. In addition, e-mail stored for longer than 180 days has less protection from law enforcement than e-mail deleted before then, he said.

Google knows people are worried
Google is very much concerned with protecting the privacy of its users, Wong said. "We take privacy very seriously from the design of the products through launch and beyond," including by building in privacy-protection options in new products, she said. Google does not have a privacy officer, but it does have Wong and a team of lawyers who work with her to address privacy issues, among other matters.

Google executives would not say exactly how the company protects the data or whether it encrypts it. The privacy policy states that Google takes "appropriate security measures to protect against unauthorized access to or unauthorized alteration, disclosure or destruction of data" and restricts access to personally identifying information to employees "who need to know that information in order to operate, develop or improve our services."

Even if Google is well-intentioned, the data could eventually end up being misused, Bankston fears.

"I think the mantra of not being evil is not disingenuous, but it is a hard credo to stick to when you're a public corporation with stockholders to please and economic incentives driving you to collect as much information as possible," Bankston said. "I'm not saying it's evil to collect this information; I'm saying it's dangerous for them to collect this."

The largest outcry against Google so far has been in response to Gmail. Launched in April 2004, Gmail now offers a whopping two gigabytes of storage for free and scans the content of messages to serve up context-related ads.

Gmail users can delete messages, but the process isn't intuitive. Deletion takes multiple steps to accomplish, and it takes an undetermined period of time to delete the messages from all the Google servers that may have copies of them, Wong said.

Another complaint is that Google uses cookies--tiny tracking tags used by most Web sites to link a specific user with his or her activities--that expire in 2038. "Although Google said that it does not cross-reference the cookies, nothing is stopping them from doing so at any time," Hoofnagle said in his testimony. However, users can delete cookies or disable them.

People can use Google search without a cookie. If a cookie is used and is not deleted by the user, the searches may then be linked to the cookie, Wong said. However, Google cannot correlate searches to a specific user unless that person voluntarily provides personally identifiable information. For example, Google does not correlate Gmail accounts with users' searches.

Google's Desktop Search, an application that lets users search for personal files and Web history stored locally on their computer, also created a stir when it was launched last year. Privacy advocates worried that someone with access to a user's computer could easily search for sensitive data.

A free version of Google's Desktop Search for businesses has an option that allows users to require a password to access it. The free consumer version of it does not.

Other privacy concerns were raised with Google's Web Accelerator, downloadable software for broadband users that was designed to speed access to Web pages by serving up cached or compressed copies of Web sites from Google's servers. However, the service does not really retain any more data than a user's Internet service provider can.

Underpinning many of the privacy concerns is the longevity of Google's data retention.

The log files created during Web searches, which don't personally identify the user, are kept for as long as the data "is useful," Wong said. She did not give any time frame or elaborate.

Google is able to link log file data, cookies and Google accounts to help it identify attempts to manipulate Web site ranking on its search pages, help track down originators of denial-of-service attacks against Web sites, and provide improvements to services in general, Wong said.

Concerned Googlers can either choose not to register for Google services or use two browsers, one for their Web searches and another for Gmail and other Google services.

For the more paranoid there are anonymizing proxy networks, such as the EFF's Tor, that bounce Internet communication through a series of routers that encrypt and decrypt it so that the origination and destination cannot be traced.

"Before you Google for something, think about whether you want that on your permanent record," Bankston advised. "If not, don't Google, or take steps so the search can't be tied back to you."

Google is no DoubleClick
In fairness, the level of anxiety hasn't come close to what online ad network DoubleClick faced in the late 1990s. DoubleClick became the subject of a Federal Trade Commission lawsuit for its attempt to combine offline and online consumer data. It settled federal and state suits and eventually phased out its Internet ad profiling service.

In a question-and-answer session during Google's media day in May, Schmidt addressed the trade-off between privacy issues and offering better services.

"Our general philosophy on those things is very much to allow people to opt in," Schmidt said. "There are always options to not use that set of technology and remain anonymous with respect to the functionality that you're using on Google."

Gartner analyst Allen Weiner opined: "Overall, I think the privacy concerns are probably overblown."

Search engines have reached a plateau in their ability to serve up the best results, Weiner said, adding that tracking users' ongoing searches will lead to improvements.

"Have search engines gotten to the point where they have developed enough trust with consumers in order to get them to give up some of their privacy?" he asked rhetorically. "At some point there's a leap of faith that needs to occur."

And it's not as though Google is the only company asking Web surfers to make that leap, said Danny Sullivan, editor of Search Engine Watch. "Overall, the issues with Google are not any different from the issues you have with Yahoo, Microsoft and others. They tend to get singled out, and unfairly, in my view," Sullivan said. "They're the biggest, and they make a big target for someone to take a swing at. It's not that the issues are not important. It's that they are applicable to the search industry" as a whole.

Trust is the key. As software industry analyst Stephen O'Grady wrote in his Tecosystems blog late last year: "Google is nearing a crossroads in determining its future path. They can take the Microsoft fork--and face the same scrutiny Microsoft does, or they can learn what the folks from Redmond have: Trust is hard to earn, easy to lose and nearly impossible to win back."
