This stuff is tough



(Cross-posted from the Google European Public Policy Blog)

Yesterday's news that the European Commission has opened a preliminary inquiry into competition complaints from three companies has generated a lot of questions about how Google's ranking works. Here, Amit Singhal, a Google Fellow responsible for ranking, who has worked in search for almost 20 years, explains the principles behind our algorithm.

Pop quiz. Get ready. You're only going to have a few milliseconds to answer this question, so look sharp. Here goes: "know the way to San Jose?" Now display the answer on a screen that’s about 14 inches wide and 12 inches tall. Find the answer from among billions and billions of documents. Wait a second - is this for directions or are we talking about the song? Too late. Just find the answer and display it. Now on to the next question. Because you'll have to answer hundreds of millions each day to do well at this test. And in case you find yourself getting too good at it, don’t worry: at least 20% of those questions you get every day you’ll have never seen before. Sound hard? Welcome to the wild world of search at Google. More specifically, welcome to the world of ranking.

Google ranking is a collection of algorithms used to seek out relevant and useful results for a user's query. There's a ton that goes into building a state-of-the-art ranking system like ours. Our algorithms use hundreds of different signals to pick the top results for any given query. Signals are indicators of relevance, and they include items as simple as the words on a webpage or more complex calculations such as the authoritativeness of other sites linking to any given page. Those signals and our algorithms are in constant flux, and are constantly being improved. On average, we make one or two changes to them every day. Lately, I’ve been reading about whether regulators should look into dictating how search engines like Google conduct their ranking. While the debate unfolds about government-regulated search, let me provide some general thinking behind our approach to ranking. Future ranking experts (inside or outside government) might find it helpful. Our philosophy has three main elements:

1. Algorithmically-generated results.
2. No query left behind.
3. Keep it simple.

After nearly two decades, I’ve lost count of how many times I've been asked why Google chooses to generate its search results algorithmically. Here's how we see it: the web is built by people. You are the ones creating pages and linking to pages. We are utilizing all this human contribution through our algorithms to order and rank our results. We think that's a much better solution than a hand-arranged one. Other search engines approach this differently -- selecting some results one at a time, manually curating what you see on the page. We believe that approach, which relies heavily on an individual's tastes and preferences, just doesn't produce the quality and relevance of ranking that our algorithms do. And given the hundreds of millions of queries we have to handle every day, it wouldn't be feasible to handle each by hand anyway.
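
To make the idea of combining signals a little more concrete, here is a toy sketch in Python. It is purely illustrative and is not how Google's ranking actually works: the three pages, the `score` function, and the `link_weight` value are all assumptions invented for this example. It simply combines a word-match signal with a count of links from other pages.

```python
# Toy illustration only: hypothetical pages, signals, and weights.
from math import log

# Hypothetical mini-web: page text plus the pages each one links to.
PAGES = {
    "directions": {"text": "driving directions to san jose california", "links": ["city"]},
    "song": {"text": "do you know the way to san jose song lyrics", "links": ["city"]},
    "city": {"text": "san jose city guide", "links": []},
}

def inbound_links(pages):
    """Count how many other pages link to each page."""
    counts = {name: 0 for name in pages}
    for name, page in pages.items():
        for target in page["links"]:
            if target in counts and target != name:
                counts[target] += 1
    return counts

def score(query, text, inbound, link_weight=0.5):
    """Combine a word-match signal with a link-based signal (weight is assumed)."""
    words = text.split()
    matches = sum(1 for term in query.lower().split() if term in words)
    return matches + link_weight * log(1 + inbound)

def rank(query, pages=PAGES):
    """Return page names ordered from highest to lowest score."""
    inbound = inbound_links(pages)
    return sorted(
        pages,
        key=lambda name: score(query, pages[name]["text"], inbound[name]),
        reverse=True,
    )
```

Calling `rank("know the way to san jose")` orders the hypothetical pages by this made-up score. Changing `score` changes the ordering for every query at once, which is the practical difference from rearranging results by hand.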

This brings me to the next point: leaving no query behind. Usually once I've explained to people the thinking behind algorithmically-generated results, some will ask me, "But what if you do a search, and the results you see are just plain lousy? Why wouldn't you just go in there by hand and change them?" The part of this question that's valid is in terms of lousy results. It happens. It happens all the time. Every day we get the right answers for people, and every day we get stumped. And we love getting stumped. Because more often than not, a broken query is just a symptom of a potential improvement to be made to our ranking algorithm. Improving the underlying algorithm not only improves that one query, it improves an entire class of queries, and often for all languages around the world in over 100 countries. I should add, however, that we do have clear written policies for websites that are included in our results, and we do take action on sites that are in violation of our policies or for a small number of other reasons (such as legal requirements, child porn, spam, viruses/malware, etc.). But those cases are quite different from the notion of rearranging the page you see one result at a time.

Finally, simplicity. This seems pretty obvious. Isn't it the desire of all system architects to keep their systems simple? We work very hard to keep our system simple without compromising on the quality of results. This is an ongoing effort, and a worthy one. Our commitment to simplicity has allowed us to innovate quickly, and it shows.

Ultimately, search is nowhere near a solved problem. Although I've been at this for almost two decades now, I'd still guess that search isn't quite out of its infancy yet. The science is probably just about at the point where we're crawling. Soon we'll walk. I hope that in my lifetime, I'll see search enter its adolescence.

In the meantime, we're working hard at our ongoing pop quizzes. Here's one last one: "search engine." In 0.14 seconds from among a few hundred million pages, our initial results are: AltaVista, Dogpile Web Search, Bing and Ask.com. I guess I'd better get back to work.

Serious threat to the web in Italy



(cross-posted from the Official Google Blog)


In late 2006, students at a school in Turin, Italy filmed and then uploaded a video to Google Video that showed them bullying an autistic schoolmate. The video was totally reprehensible and we took it down within hours of being notified by the Italian police. We also worked with the local police to help identify the person responsible for uploading it and she was subsequently sentenced to 10 months community service by a court in Turin, as were several other classmates who were also involved. In these rare but unpleasant cases, that's where our involvement would normally end.

But in this instance, a public prosecutor in Milan decided to indict four Google employees: David Drummond, Arvind Desikan, Peter Fleischer and George Reyes (who left the company in 2008). The charges brought against them were criminal defamation and a failure to comply with the Italian privacy code. To be clear, none of the four Googlers charged had anything to do with this video. They did not appear in it, film it, upload it or review it. None of them know the people involved or were even aware of the video's existence until after it was removed.

Nevertheless, a judge in Milan today convicted 3 of the 4 defendants — David Drummond, Peter Fleischer and George Reyes — for failure to comply with the Italian privacy code. All 4 were found not guilty of criminal defamation. In essence this ruling means that employees of hosting platforms like Google Video are criminally responsible for content that users upload. We will appeal this astonishing decision because the Google employees on trial had nothing to do with the video in question. Throughout this long process, they have displayed admirable grace and fortitude. It is outrageous that they have been subjected to a trial at all.

But we are deeply troubled by this conviction for another equally important reason. It attacks the very principles of freedom on which the Internet is built. Common sense dictates that only the person who films and uploads a video to a hosting platform could take the steps necessary to protect the privacy and obtain the consent of the people they are filming. European Union law was drafted specifically to give hosting providers a safe harbor from liability so long as they remove illegal content once they are notified of its existence. The belief, rightly in our opinion, was that a notice and take down regime of this kind would help creativity flourish and support free speech while protecting personal privacy. If that principle is swept aside and sites like Blogger, YouTube and indeed every social network and any community bulletin board, are held responsible for vetting every single piece of content that is uploaded to them — every piece of text, every photo, every file, every video — then the Web as we know it will cease to exist, and many of the economic, social, political and technological benefits it brings could disappear.

These are important points of principle, which is why we and our employees will vigorously appeal this decision.

Committed to competing fairly



(Cross-posted from the European Public Policy Blog)

As Google has grown, we've not surprisingly faced more questions about our role in the advertising ecosystem and our overall approach to competition. This kind of scrutiny goes with the territory when you are a large company. However, we've always worked hard to ensure that our success is earned the right way -- through technological innovation and great products, rather than by locking in our users or advertisers, or creating artificial barriers to entry.

The European Commission has notified us that it has received complaints from three companies: a UK price comparison site, Foundem, a French legal search engine called ejustice.fr, and Microsoft's Ciao! from Bing. While we will be providing feedback and additional information on these complaints, we are confident that our business operates in the interests of users and partners, as well as in line with European competition law.

Given that these complaints will generate interest in the media, we wanted to provide some background to them. First, search. Foundem - a member of an organisation called ICOMP which is funded partly by Microsoft - argues that our algorithms demote their site in our results because they are a vertical search engine and so a direct competitor to Google. ejustice.fr's complaint seems to echo these concerns.

We understand how important rankings can be to websites, especially commercial ones, because a higher ranking typically drives higher volumes of traffic. We are also the first to admit that our search is not perfect, but it's a very hard computer science problem to crack. Imagine having to rank the 272 million possible results for a popular query like the iPod on a 14 by 12 inch computer screen in just a few milliseconds. It's a challenge we face millions of times each day.

Our algorithms aim to rank first what people are most likely to find useful and we have nothing against vertical search sites -- indeed many vertical search engines like Moneysupermarket.com, Opodo and Expedia typically rank high in Google's results. For more information on this issue check out our guidelines for webmasters and advertisers, and for an independent analysis of Foundem's ranking issues please read this report by Econsultancy.

Regarding Ciao!, they were a long-time AdSense partner of Google's, with whom we always had a good relationship. However, after Microsoft acquired Ciao! in 2008 (renaming it Ciao! from Bing) we started receiving complaints about our standard terms and conditions. They initially took their case to the German competition authority, but it now has been transferred to Brussels.

Though each case raises slightly different issues, the question they ultimately pose is whether Google is doing anything to choke off competition or hurt our users and partners. This is not the case. We always try to listen carefully if someone has a real concern and we work hard to put our users' interests first and to compete fair and square in the market. We believe our business practices reflect those commitments.

YouTube on Fox News - Power Player of the Week



(cross-posted from the CitizenTube blog)

Chris Wallace at Fox News Sunday came to our DC office last week to do a piece for his Sunday show. Every week, FNS does a feature called "Power Player of the Week" - this week, Chris chose YouTube. We had a great conversation with him about politics on the site and how our interview with President Obama came together earlier this month.


Wallace himself is no stranger to YouTube. Before the interview, we chatted about Chris' first big YouTube moment - which was his interview with President Clinton in which Clinton became irked when Wallace pressed him on what more he could have done to go after Osama Bin Laden.

Fox now has a YouTube channel at youtube.com/foxnews.

Control your Buzz settings in Google Dashboard



Earlier this week, I noted some of the improvements we've made to Buzz based on some really helpful user feedback. We've made a few other efforts to make Buzz settings easier to manage, including adding Buzz to the Google Dashboard.

The Google Dashboard is a tool that summarizes data for each Google product you use and provides direct links to your personal settings. For Buzz, the Dashboard is another place to see how many people you're following, how many people are following you, and information about your recent posts as well as links to change your Buzz settings.

The Dashboard is just another way for users to find out more about products like Buzz -- and how to exercise choice and control over their information and their use of our products.

Check it out and let us know what you think.

Using your feedback to improve Buzz



Last Tuesday we introduced Google Buzz, a new way to start conversations and share updates, links and more. In the days since we launched, we've heard a lot of feedback about how Buzz works.

We've heard your concerns loud and clear, and we've already taken steps to address them. On Thursday we made some improvements to Buzz so that privacy controls are more visible and useful. And yesterday afternoon we announced further changes that we'll be rolling out over the next few days — including modifying the start-up process so that you review our suggestions for people to follow (rather than automatically following them from the get-go), and adding a Buzz tab to the Gmail settings page so that privacy controls are more easily accessible.

At Google, we like to respond to user feedback, then iterate based on that feedback. So thanks for speaking up. We've been working quickly over the holiday weekend to incorporate the new changes into Buzz, and we'll be continuing to improve your user experience.

Experimenting with new ways to make broadband better, faster, and more available



Given how important broadband capability is to economic growth and job creation, it's no surprise that it's become a major topic of discussion in Washington.

The FCC is currently finalizing its National Broadband Plan to present to Congress next month. Recently we suggested that as part of its Plan, the Commission should build ultra high-speed broadband networks as testbeds in several communities across the country, to help learn how to bring faster and better broadband access to more people. We thought it was important to back up our policy recommendation with concrete action, so now we've decided to build an experimental network of our own.

Today we announced plans to build and test ultra high-speed broadband networks, delivering Internet speeds more than 100 times faster than what's available today to most Americans, over 1 gigabit per second fiber connections. As a first step, we're asking interested local governments to complete a request for information, which will help us determine where to build. Our goal is to experiment with new ways to help make broadband Internet access better, faster, and more widely available.

We're excited to see how consumers, small businesses, anchor institutions, and local governments will take advantage of ultra high-speed access to the Net. In the same way that the transition from dial-up to broadband made possible the emergence of online VoIP and video and countless other applications, we think that ultra high-speed bandwidth will lead to many new innovations – including streaming high-definition video content, remote data storage, distance learning, real-time multimedia collaboration, and others that we simply can't imagine yet.

This project will build on our ongoing efforts to expand and improve Internet access for consumers – from our free municipal Wi-Fi network in Mountain View, CA, to our advocacy in the 700 MHz spectrum auction, to our work to open the TV "white spaces" to unlicensed uses.

In building our broadband testbed, we plan to incorporate the policies we've been advocating for in areas like network neutrality and privacy protection. Even on a small scale, building an experimental network will also raise other important legal and policy issues, from local environmental law to rights-of-way, so we'll be working closely with communities, public officials, and other stakeholders to make sure we get this right.

By several measures, no matter who you ask, the U.S. in far too many places still lags behind many countries in Europe and Asia in terms of broadband speed, availability, and uptake. While it's unlikely that our experiment will be the silver bullet that delivers ultra high-speed Internet access to the rest of America, our engineers hope to learn some important things from this project. We can't wait to see what developers and consumers alike can accomplish with access to 1 gigabit broadband speeds.

Safety Mode: giving you more control on YouTube



(cross-posted from the Official YouTube Blog)

Diversity of content is one of the great things about YouTube. But we know that some of you want a more controlled experience. That's why we're announcing Safety Mode, an opt-in setting that helps screen out potentially objectionable content that you may prefer not to see or don't want others in your family to stumble across while enjoying YouTube. An example of this type of content might be a newsworthy video that contains graphic violence such as a political protest or war coverage. While no filter is 100% perfect, Safety Mode is another step in our ongoing desire to give you greater control over the content you see on the site.

It's easy to opt in to Safety Mode: Just click on the link at the bottom of any video page. You can even lock your choice on that browser with your YouTube password. To learn more, check out the video below.

And remember, ALL content must still comply with our Community Guidelines. Safety Mode isn't foolproof, but it provides a greater degree of control over your YouTube experience. Safety Mode is rolling out to all users throughout the day; watch for the new link at the bottom of any YouTube page.

Postponed: Today's D.C. Talk on Democracy Online



Given that most of us are still digging out from this weekend's record snowfall, we're postponing today's D.C. Talk, Democracy Online: Can the Internet Bring Change? We hope to reschedule soon. In the meantime, keep warm!

Stanford expands Google Books agreement



Today, Stanford University announced that it has expanded our original partnership to take advantage of our settlement agreement to make millions of works from its library collection accessible to readers, researchers, and book lovers across the United States. That means that if the settlement agreement is approved by the court, anyone in the US will be able to find, preview and buy online access to books from Stanford's library. Stanford joins the University of Michigan, University of Wisconsin-Madison, and University of Texas, who also expanded their original partnerships with Google.

Google was founded on the principle of making information more accessible to more people, so we're excited that Stanford has joined in our continuing efforts to bring these books to more people around the country. You can read more at the Stanford University website here.

Your Questions for President Obama



(cross-posted from the Official YouTube Blog)

Today, President Obama had his first exclusive interview after his State of the Union speech with you, the YouTube community. The President engaged in a direct conversation about a broad range of issues, from generating jobs to opening up the health care process to investments in nuclear energy.

The best part of the process was that it was driven by you. Five days ago, as the President began his State of the Union address, we opened up our Moderator platform on CitizenTube, and over 55,000 of you submitted and voted on both video and text questions. Some of them were hard-hitting, others were emotional, and some were even funny.

You can watch the full interview now:



Because we could ask fewer than 0.2% of the 11,696 questions submitted, it was hard to choose the final handful. Here's how the selection process worked: we tried to cover a range of issues, minimize duplicate questions, and include both video and text submissions. First, we looked at which topics had the highest participation -- like jobs, foreign policy, health care and government reform -- to determine how many questions to ask in each category. We then took the top 5% of video and text questions and picked questions that reflected what you cared about. None of them were chosen by the White House or seen by the President before the interview.
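
For the curious, the first two steps of that process can be sketched roughly in Python. This is only an illustration under stated assumptions: the function names, topic counts, and vote totals are hypothetical, and the final pick was an editorial judgment that no snippet captures. For scale, under 0.2% of 11,696 works out to at most 23 questions.

```python
# Illustrative sketch only: the topics, counts, and vote totals used with
# these helpers are made up for the example.

def allocate_slots(participation, total_slots):
    """Split interview slots across topics in proportion to participation.

    Rounding means the slots may not always sum exactly to total_slots.
    """
    total = sum(participation.values())
    return {
        topic: round(total_slots * count / total)
        for topic, count in participation.items()
    }

def shortlist(questions, top_fraction=0.05):
    """Keep the top-voted fraction of questions (always at least one)."""
    ranked = sorted(questions, key=lambda q: q["votes"], reverse=True)
    keep = max(1, int(len(ranked) * top_fraction))
    return ranked[:keep]
```

With hypothetical participation counts of 500, 300 and 200 for three topics and ten slots, `allocate_slots` gives each topic a share of slots that matches its share of participation, and `shortlist` then narrows each topic's pool to its top-voted 5%.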

In some cases, we combined questions, grouping similar ones from different categories like health care and government reform:
"Why are the health care meetings, procedures, etc not on CSPAN as promised?" - Mr. Anderson, Texas
"How do you expect the people of this country to trust you when you have repeatedly broken promises that were made on the campaign trail. Most recently, the promise to have a transparent healthcare debate..." - Warren Hunter, Brooklyn
Sometimes the top overall question in the category was a video question:



To try to get as many questions in as possible, we had a section called "Good idea/Bad idea" in which we tried to solicit short responses from the President on ideas you sent in that might not be presented to him in traditional interviews. And in all cases, we tried to select the top questions that would solicit conversation, lead to substantive answers, and hadn't been asked in previous programs we've had with the President.

We had many more questions on hand than we had time to deliver, so we're pleased that the White House has agreed to respond to more of the top-voted questions in their blog soon, at whitehouse.gov.

We hope this interview brings us one step closer to creating better access to your government through YouTube -- and we'd love to hear your feedback and any other ideas you have on YouTube's political programming.

Cloud computing in the President's 2011 budget



When it comes to cloud computing, the Obama Administration is putting some skin in the game.
Everyone talks about the capacity of cloud computing to transform government and reduce costs (one study estimates that federal agencies could eventually save 85% of their IT budgets by moving to the cloud). But the vast majority of the federal government's IT spending today goes to traditional desktop or client-server computing. And until that changes, the federal government won't have the ability to tap the true potential of cloud computing.

That's why the inclusion of cloud computing in the Obama Administration's new FY 2011 budget is a big deal. Check out page 42 of the budget overview which identifies the problem:
"Under the leadership of the Federal Chief Information Officer, the Administration is continuing its efforts to close the gap in effective technology use between the private and public sectors. Specifically, the Administration will continue to roll out less intensive and less expensive cloud-computing technologies; reduce the number and cost of Federal data centers; and work with agencies to reduce the time and effort required to acquire IT, improve the alignment of technology acquisitions with agency needs, and hold providers of IT goods and services accountable for their performance."
Later on page 321 of the Analytical Perspectives section, the Administration writes that
"Adoption of a cloud computing model is a major part of the strategy to achieve efficient and effective IT. After evaluation in 2010, agencies will deploy cloud computing solutions across the Government to improve the delivery of IT services."
And on page 327, the Administration says that it will, among other things
"[...] initiate pilot projects in cloud computing to transform how the Government provides computing services while taking steps to improve the security of Federal information and systems."
What about specific funding commitments? We've learned this morning that federal CIO Vivek Kundra will control a $35 million fund to set up innovative tech pilot projects, including projects using cloud computing.

These pilot projects will be a drop in the bucket of the $79.4 billion federal IT budget, but it's still a great start. Here's hoping that the congressional budget and appropriations committees agree with the Administration that cloud computing represents the future for federal government computing.
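
To put "a drop in the bucket" in numbers, here is a quick back-of-the-envelope calculation in Python using only the two figures quoted above. The helper name `share_percent` is ours, chosen just for this illustration.

```python
def share_percent(part, whole):
    """Return part as a percentage of whole."""
    return part / whole * 100

PILOT_FUND = 35e6    # $35 million pilot fund, as reported above
IT_BUDGET = 79.4e9   # $79.4 billion federal IT budget, as cited above

pilot_share = share_percent(PILOT_FUND, IT_BUDGET)
print(f"Pilot fund share of federal IT budget: {pilot_share:.3f}%")
```

That comes to roughly 0.04% of the total, or about one dollar in every 2,300.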