In previous posts, I wrote about the epic battles brewing between spammers and content farms, which are turning the web into a massive garbage dump, and search providers, which have to choose between profit and customer satisfaction. This is a serious problem. The content farms are “dumbing down” the web by churning out thousands of mostly low-quality articles every day on topics that Google tells them they can make money from. All of these players are raking in billions of dollars at our expense.

This week, I had the opportunity to moderate a panel discussion between Google, Microsoft, and Blekko. The event was called Farsight 2011: Beyond the Search Box, and was organized by BigThink and Microsoft. As I joked, it seemed odd that Google was playing the role of the “evil” monopolist and Microsoft the “good” contender, while Blekko was a fly on the wall.

I had originally invited Google SEO development head Amit Singhal and Microsoft Research GM Ashok Chandra. But Ashok dropped out in favor of Corporate VP of Bing Harry Shum, and Amit dropped out in favor of Webspam team head Matt Cutts. I was very disappointed, because Matt has the reputation of being a really nice guy, a “teddy bear”, while Harry has the reputation of being feisty. I was afraid that Harry and Rich Skrenta (Blekko CEO) would devour Matt, and that it would seem as if I had set up an ambush for Google.

Little did I know that Google had its own ambush cooking: that Matt was more like a tiger than a teddy bear.

At the conference, I gave my spiel on my vision of search: how I want my computer to serve me and tell me what I want to know, rather than my having to cater to its whims by entering specific keywords in a text box and reading through text links—which are often baited by spammers. I challenged Matt to tell everyone what Google was doing about the spam. Matt, instead, went on the warpath and accused Bing of stealing Google’s information. He disclosed a sting operation that his team had run. He expressed outrage at Microsoft’s ethics. Harry Shum fired back, defended Bing, and accused Google of playing games.

There has been extensive media coverage about this. Harry Shum and Yusuf Mehdi of Microsoft both posted blogs to respond to Google’s allegations. So I don’t need to visit the same territory. You can watch the video of the event and form your own opinions. There was a lot more discussed in 40 minutes than was covered by the media, so this is worth watching.

Both sides have strong views and believe they are right. In opening the debate, I said that as a professor, I can’t condone any kind of plagiarism or cheating—and that is what Microsoft’s usage of Google data seems to amount to. But in the tech world, such information exchange is the norm. Everyone cheats and this may be a good thing for innovation. So there is no black and white here. Both sides are right and they are wrong.

The one thing that is clear is that Google pulled off a huge PR coup. It changed the topic. Media coverage isn’t about spam and how Google profits from this anymore; we are debating how valuable Google’s search results are. Here are the real issues we should be discussing:

  1. Who really owns the data that Google and Bing are tussling over? Is it the search providers—that “cheat” and copy from all over the web? Or is it the content creators—us—who they “steal” from? Why do Google and Microsoft believe that they own our information? And why aren’t they paying us for using this?
  2. Facebook rivals Google in web traffic and will get way ahead. And Google can’t search within Facebook’s walls. Doesn’t this give a huge, long-term, advantage to Bing, which can (within limits)?
  3. Blekko announced a bold decision to block content farms—sites like eHow and Answerbag. Will Google and Microsoft take similar steps? Will they be able to forsake the revenue? Can the volumes of spam we are dealing with even be screened algorithmically or do we need curated search solutions?
  4. We need a standard measure of web quality. Google says that it has not noticed any reduction in web quality. Yet most experts agree that this has declined significantly over the past two or three years. Why doesn’t Google, as the market leader, work with its competitors to create an open measure that can be used by everyone? Let Google prove to us that it is, indeed, better than the rest.
  5. Why not allow web users to designate what sites are spam and make this information publicly available? Google lets you filter your own results, but why not share these data with everyone? Sites that believe they are unfairly labelled can lodge an appeal. Why the secrecy?

So let’s get back on topic. Harry Shum and Matt Cutts can duke it out in a bar somewhere. What I want is for them to clean up the web and give me the best search results.

  • Yaxiong Zhao

    He clearly is an American-born Chinese, cultivated by American corporate culture…

  • Pingback: What I Want in My New Google | JetLib News

  • Seo_alexander

    It is worth noting that Google very carefully designed pages and “unique” misspellings that were not indexed or linked to from other sources. By carefully engineering these atypical search results, Google’s search engineers must have known that they were removing all other factors that could influence the ranking of the content in question. With the test engineered this way, only click data from the Bing toolbar could influence the end result; all other, much stronger ranking factors were removed in a “very innovative way”. The presented “copied results” are therefore unusual. They are not copies of Google rankings, but ranking results produced by a test that the Google search engineering team constructed so that it would look as if Bing were copying actual Google ranking results.
    In normal cases (normal misspellings and normal queries), Bing would also take into account all the other ranking factors in its own algorithm to determine search rankings.

    Google intentionally made this look as if Bing were copying Google’s search results, even though they surely knew that Bing only used click data in an innovative way and only as one part of its complete ranking results. Those involved in the SEO / search industry who have an interest in the technology used to determine rankings can easily spot that this was a deliberate attack by Google to make it look as if Microsoft/Bing were copying their actual ranking results. This is obviously not the case. Bing uses click data from users of its toolbar in a very innovative way.

    Note: Misspelled words may be found in the correct version on the very same pages as the misspelled version or “typo”. Feeding back what human visitors select and click on in the provided search results (along with other factors, such as how long they stay on the page and the site in question) is likely to be a good way to determine the quality of the search results. Human visitors are better at evaluating content than an algorithm. Matt Cutts’s claim that Google would not use click data from its toolbar sounds surprising. Their choice. The reason must then be that Google collects click data not from its toolbar but from its browser, and surely whenever you are logged in to your gmail.com account.

    I am very disappointed that Danny Sullivan from SearchEngineLand accepted Google’s “material” and published it in the form he did. With 15 years of experience in search, he should surely have realised that Google was feeding the world a false claim of copied rankings.
    Danny Sullivan says that in the end he “sympathised” with Google.
    I believe it should read $ympathised. I wonder how much Danny was paid to do this?
    Google then goes on to say “we should not copy each other but innovate”, when what Bing had actually done was a kind of innovation. Sorry to say, but to me Google and Danny Sullivan / Search Engine Land are the ones who lose their credibility. I can well understand that people not involved in the search industry can get fooled by Google’s claims; those who understand search should know better.

    Shame on Google, Shame on Danny.

    Alex

  • Scott Lee

    This is what happens when you hire CHINESE. They copy and on top of that they’ll tell you to be careful with any charges. Nothing new!

  • Pingback: Tweets that mention How Google Ambushed Microsoft and Changed the Subject -- Topsy.com

  • Gianluca Spadini

    It seems that you are removing comments you do not like or that are critical of your way of thinking. Free speech went south?

  • Nagle

    Yes, Microsoft won that PR round. Bing was stupid to use click data in that way. But that’s a side issue.

    Google’s addiction to AdWords revenue offers an opportunity for other search engines to do better. Google needs those content farms. Bing does not. Blekko is already filtering them out. Bing hasn’t done as much in that direction as they could. It’s not hard to identify the major content farms. Whether to filter them is a policy issue. Moving them down in search results seems to be indicated.

    Blekko’s real innovation to date is that health-related queries are searched only against a list of sites known to be authoritative on health issues. This is valuable. So many junk sites appear in health-related results that it’s difficult to get good information. Blekko has dealt with that.

    Don’t expect a “standard measure of web quality”. That’s what defines a search engine. Google considers that metric to be their crown jewels. A standard measure of business legitimacy, however, is quite feasible. We do that at SiteTruth. We don’t have to be secretive about it, because the data sources used, from SEC data to Dun and Bradstreet ratings, come from sources that take strong steps to assure the integrity of their data.

    The next generation of web spam is already being developed: video spam. Both Demand Media and Aol are now generating vast amounts of junk video content. Much of this junk is hosted on YouTube, so finding the source of the junk is difficult. Existing search engines don’t deal with video at all well.

    This is part of a general trend of exploiting major “Web 2.0” and “social” sites to host spam content, at no cost to the spammer. It will soon be necessary for search engines to down-rate all content with commercial intent appearing on social networking sites.

    Don’t expect “crowdsourcing” to help. Recommendation systems are already choked with phony recommendations. Any free recommendation system which gets enough traffic to matter will be spammed. The recommendation systems which work are ones directly involved in transactions, such as eBay and Amazon. They can tell who actually purchased the product or service. So can Groupon, which is food for thought about who might be a power in search in the future.

    John Nagle
    SiteTruth
    nagle@sitetruth.com

  • Gianluca Spadini

    # Who really owns the data that Google and Bing are tussling over?
    Who owns the air I am exhaling?

    Why do Google and Microsoft believe that they own our information?
    Where did you get this? I do not recall seeing anywhere a statement from Google or MS that they OWN our information.

    And why aren’t they paying us for using this?
    When you enter, for example, your birth date into the dentist’s questionnaire, why don’t you ask for money??

    Doesn’t this give a huge, long-term, advantage to Bing, which can (within limits)?
    Facebook contains certain specific information, which does not necessarily mean that it would make any difference to a search for, say, Maxwell’s equations.

    Will Google and Microsoft take similar steps?
    Am I going to pee in the next 10 minutes? Should we ask the whole internet audience this question? Why don’t you ask them (Google, MS)?

    Why doesn’t Google, as the market leader, work with its competitors to create an open measure that can be used by everyone?
    I think the users are voting with their mouse clicks.

    Let Google prove to us that it is, indeed, better than the rest.
    Do you think that users are dummies? Don’t you think that users know what is giving them the best answer? Why should GOOGLE (or anybody else) start the pissing contest? Google does NOT prevent users from switching their search engine (like MS did with their secret file format for WORD doc).

    SUMMARY: a lot of words in your article/commentary/…… no real substance… in short VAPOR WARE

  • Ronald

    Are you sure you are talking to the right people about the future of search?
    They seem to like something like this:
    Hell with Rules
    http://jeffjonas.typepad.com/jeff_jonas/2010/07/hell-with-rules.html

    Also, I would ask what the Information Value of any given article is, instead of its quality. One can arrive at the Information Value based on information; I don’t know how to get to quality.