This semester, my students at the School of Information at UC Berkeley researched the VC system from the perspective of company founders. We prepared a detailed survey, randomly selected 500 companies from a venture database, and set out to contact the founders. Thanks to Reid Hoffman, we were able to get premium access to LinkedIn, which was very helpful and provided a wealth of information. But some of the founders didn’t have LinkedIn accounts, and others didn’t respond to our LinkedIn “InMails”. So I instructed my students to use Google searches to research each founder’s work history, year by year, and to track him or her down that way.

But it turns out that you can’t easily do such searches in Google any more. Google has become a jungle: a tropical paradise for spammers and marketers. Almost every search takes you to websites that want you to click on links that make them money, or to sponsored sites that make Google money. There’s no way to do a meaningful chronological search.

Instead, we ended up using a web-search tool called Blekko. It’s a new technology and is far from perfect, but it is innovative and helps fill the vacuum left by the lack of competition for Google (and Bing).

Blekko was founded in 2007 by Rich Skrenta, Tom Annau, Mike Markson, and a bunch of former Google and Yahoo engineers. Previously, Skrenta had built Topix and what has become Netscape’s Open Directory Project. For Blekko, his team has created a new distributed computing platform to crawl the web and create search indices. Blekko is backed by notable angels, including Ron Conway, Marc Andreessen, Jeff Clavier, and Mike Maples. It has received a total of $24 million in venture funding, including $14M from U.S. Venture Partners and CMEA Capital.

In addition to providing regular search capabilities like Google’s, Blekko allows you to define what it calls “slashtags” and filter the information you retrieve according to your own criteria. Slashtags are mostly human-curated sets of websites built around a specific topic, such as health, finance, sports, tech, and colleges. So if you are looking for information about swine flu, you can add “/health” to your query and search only the top 70 or so relevant health sites rather than tens of thousands of spam sites. Blekko crowdsources the editorial judgment for what should and should not be in a slashtag, as Wikipedia does. One Blekko user created a slashtag for 2100 college websites, so anyone can do a targeted search for all the schools offering courses in molecular biology, for example. Most searches are like this: they can be restricted to a few thousand relevant sites. The results become much more relevant and trustworthy when you can filter out all the garbage.

The feature that I’ve found most useful is the ability to order search results. If you are doing searches by date, as my students were, Blekko allows you to add the slashtag “/date” to the end of your query and retrieve information in chronological order. Google does provide an option to search within a date range, but these are the dates when a website was indexed rather than created, which means the results are practically useless. Blekko makes an effort to index the page by the date on which it was actually created (by analyzing other information embedded in its HTML). So if I want to search for articles that mention my name, I can do a regular search; sort the results chronologically; limit them to tech blog sites or to any blog sites for a particular year; and perhaps find any references related to the subject of economics. Try doing any of this in Google or Bing.
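To make the idea concrete, here is a minimal sketch in Python of what slashtag-style filtering plus date ordering amounts to. It is not Blekko's implementation: the `Result` type, the `SLASHTAGS` table, the function names, and the example.org sites are all hypothetical, and the sketch simply assumes each result already carries a page-creation date.

```python
from dataclasses import dataclass
from datetime import date
from urllib.parse import urlparse


@dataclass(frozen=True)
class Result:
    url: str
    title: str
    created: date  # assumed to be the page-creation date, not the crawl date


# Hypothetical slashtag table: each tag maps to a curated set of hostnames.
SLASHTAGS = {
    "health": {"health.example.org", "clinic.example.com"},
}


def apply_slashtag(results, tag):
    """Keep only results whose hostname is in the curated set for `tag`."""
    allowed = SLASHTAGS[tag]
    return [r for r in results if urlparse(r.url).hostname in allowed]


def sort_by_date(results, newest_first=True):
    """Order results by creation date, roughly what a date slashtag asks for."""
    return sorted(results, key=lambda r: r.created, reverse=newest_first)


# Made-up results for a query such as "swine flu /health /date".
results = [
    Result("http://health.example.org/flu", "Swine flu facts", date(2009, 5, 1)),
    Result("http://spam.example.net/flu-pills", "Buy flu pills", date(2010, 1, 3)),
    Result("http://clinic.example.com/h1n1", "H1N1 update", date(2009, 11, 20)),
]

filtered = sort_by_date(apply_slashtag(results, "health"))
for r in filtered:
    print(r.created.isoformat(), r.title)
```

The point of the sketch is that the curated site list does the heavy lifting: the spam result never has to be ranked at all, because its hostname is simply not in the set.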

The problem is that content on the internet is growing exponentially and the vast majority of this content is spam. This is created by unscrupulous companies that know how to manipulate Google’s page-ranking systems to get their websites listed at the top of your search results. When you visit these sites, they take you to the websites of other companies that want to sell you their goods. (The spammers get paid for every click.) This is exactly what blogger Paul Kedrosky found when trying to buy a dishwasher. He wrote about how he began Googling for information…and Googling…and Googling. He couldn’t make head or tail of the results. Paul concluded that “the entire web is spam when it comes to major appliance reviews”.

Unfortunately, it isn’t just appliance reviews that are the problem. Almost any popular search term will take you into seedy neighborhoods.

Content creation is big business, and there are big players involved. For example, Associated Content, which produces 10,000 new articles per month, was purchased by Yahoo! for $100 million in 2010. Demand Media has 8,000 writers who produce 180,000 new articles each month. It generated more than $200 million in revenue in 2009 and is planning an initial public offering valued at about $1.5 billion. This content is what ends up as the landfill in the garbage websites that you find all over the web. And these are the first results that show up in your Google searches.

The bottom line is that we’re fighting a losing battle for the web and need alternative ways of finding the information that we need. I hope that Blekko and a new breed of startups fill this void and do to Google what Google did to the web in the late ’90s: clean up the spam and clutter.

  • http://www.facebook.com/people/Rob-Gubatan/100003677626864 Rob Gubatan

    This is a great blog post – I enjoyed reading it & gained a lot – on a side note, I am turning big 40 – yes yes ! I know getting old but that’s a part of life. I am actually quite blessed with a good family & very obedient kids; anyways – as the 40 is hitting I am realizing that I have not done a great job with my retirement planning. One of my wife’s cousin is a an agent with Bankers Life and Casualty so I reached out to him over the last weekend – it seems that they have great products from life insurance to annuities & they work with individuals to provide great service & plan for the retirement. Does anyone here had any experience &/or know any other companies whom I should checkout before signing up with Bankers Life and Casualty Company. Any feedbacks will a great help – just FYI – I am planning to retire at/around 68 yrs of the age.

  • Mital

    This is a great article. I have felt this for a long time, but you gave words to my thoughts. Ever since SEO became a business, tons of microblogging and RTs have spoiled the organized data and made casual searches useless.

    Blekko seems to be a good option. Shows less micro-blogging results.

    Thanks for highlighting.

  • Anonymous

    I came across this article through LiveSide.net and am pleased I did. I’ve long found it annoying that you cannot order search results by date. The only thing that properly comes close (i.e., by creation date) is Google News, but that only searches “News” articles, making it less than effective. Like you say, the date search in the main Google Search is pretty useless.

    When trying out new search engines (Bing included) I always end up back at Google because it integrates so well with Mail, Calendar, Contacts, Android etc… Even Bing, whilst having some aspects better than Google (Bing Maps for example), is mostly just “as good” as Google. Blekko, however, with its slashtags approach, is very powerful.

    Thanks again. One to bookmark!

  • Fred

    I’m creating a startup to help unclutter this mess. Basically it will be a crowdsourced SEO marketplace. Thoughts?

    • http://www.wadhwa.com Vivek Wadhwa

      From what I know (and I am not an expert in this), this seems like a very good idea. Best to speak to others in this space.

  • araichura

    Great piece of article and you hit the right

  • bill ford

    http://ford321.posterous.com/tag/justenough

    I agree with your concerns about Google and I have been designing an answer to the problem. What is driving down the quality of services and user experiences is the fact that this generation of apps, Facebook and Google, is based on the free and easy exploitation of user contacts and profiles which are resold in a marketplace to which users have no access. I have designed an alternative service based on users managing their own profile sales and purchases. I have sketched the concept out at the link above. What do you think? thx, -bill

    • http://www.wadhwa.com Vivek Wadhwa

      Bill, I don’t know the answer. You have given this solution far more thought than I have.

  • Poet

    I write poetry. I have the number one spot on two Google verts. In the old days, these verts were useless for promoting poems, because every link was to a spam site. The sites themselves had no use to them. Google fixed that. At least you can find some poetry sites on these verts. I once even tried to rally other poets in my two verts to complain to Google so Google would remove spam sites, so that these legit poets would have higher ranks in the results. Even though I already had the no. 1 spots, and it wouldn’t affect me.

    So I checked out Blekko. I searched on the vert that gets me the most traffic. My site was not no. 1. It was no. 22. Just one site before my site was an individual’s site. Almost everything else was garbage, including what I could tell, without clicking on the links, were probably virus-laden fake domains (and a site that had stolen one of my poems). Well, Google has been clamping down on those kind of fake domains.

    I checked well past my domain, too. Links to dating sites, Amazon books, and more garbage.

    Sure, I could join Blekko and try to fix some of this. It might be in my best interest, too, as a webmaster. But Blekko returned 3 million hits for my vert. I’m supposed to personally filter out that many links? Yeah, right.

    • http://www.wadhwa.com Vivek Wadhwa

      I agree that Blekko is far from perfect. The reason I highlighted them was to show that there are alternative ways to search and that these need to be encouraged. Google needs competition, badly.

      • Anonymous

        Google needs competition very badly. Competition breeds innovation and Google’s “Search” innovations have been rather lacklustre of late. They’ve been concentrating too much on other areas of their business.

  • http://ted-strauss.myopenid.com/ ted

    great piece and right to the point.
    competition keeps the big guys on their toes, and always benefits us consumers.
    it looks like 2011 will be the first time Google gets to play defense. I can’t wait to watch it all play out.

  • JimC

    We make this unspoken assumption that “everything” is available on the Internet. I guess I would like a scholar like Prof. Wadhwa to explore that. What information isn’t available on the Internet? I think we see here that, of the information that is available, much of it is unreliable, if not merely folklore. Can we have an Internet of ideas, not just commerce? I’m not sure we want truth crowd-sourced…

    • http://www.wadhwa.com Vivek Wadhwa

      That would be a great study indeed. I hope someone does it…