

Do You Need Google et al.? Hacker News Doesn’t (Make That Does)
March 16, 2010, 10:11 am
Hang around any industry conference, forum or blog long enough and you’ll find someone lamenting our dependence on Google, or search engines altogether. It’s absolutely true that we as webmasters and marketers need to diversify our traffic strategies (you know what they say about eggs and baskets)—but are you willing to take the step to block all search engines from your site?
Hacker News was—at least for a little while. At news.ycombinator.com recently, the robots.txt file was changed to disallow all crawling from search engines, as theNextWeb reports. However, Paul G. at Hacker News quickly explained:
Don’t worry, it doesn’t mean anything. The software for ranking applications runs on the same server, and it is horribly inefficient (something 4 people use every 6 months doesn’t tend to get optimized much). This weekend all of us were reading applications at the same time, and the system was getting so slow that I banned crawlers for a bit to buy us some margin. (Traffic from crawlers is much more expensive for us than traffic from human users, because it interacts badly with lazy item loading.) We only finished reading applications an hour before I had to leave for SXSW, so I forgot to set robots.txt back to the normal one, but I just did now.
There’s nothing wrong with that (though you’d hope you wouldn’t forget that kind of thing!). Instead of the blanket “User-agent: * Disallow: /” rule that theNextWeb spotted, Hacker News’s robots.txt now blocks all user agents from only five selected paths.
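To see the difference between the two kinds of robots.txt, here is a minimal sketch using Python’s standard-library robots.txt parser. The blanket rule is the one quoted above; the selective file is only an illustration, since the actual five paths aren’t listed here. The paths /private/ and /drafts/, the example.com URLs, and the "AnyBot" crawler name are all made-up assumptions.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content supplied as a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# The blanket ban: every crawler is asked to stay away from every path.
BLANKET_BAN = """\
User-agent: *
Disallow: /
"""

# A selective ban. These two paths are hypothetical placeholders,
# not the real paths from any site's robots.txt.
SELECTIVE_BAN = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""

blanket = build_parser(BLANKET_BAN)
selective = build_parser(SELECTIVE_BAN)

# Hypothetical URLs and crawler name, used only for the comparison.
HOME = "http://example.com/"
PRIVATE_PAGE = "http://example.com/private/page"
BOT = "AnyBot"

for label, parser in (("blanket", blanket), ("selective", selective)):
    for url in (HOME, PRIVATE_PAGE):
        verdict = "allowed" if parser.can_fetch(BOT, url) else "blocked"
        print(f"{label:9} {url:35} {verdict}")
```

Keep in mind that robots.txt is a request, not an enforcement mechanism: well-behaved crawlers honor it, but nothing in the file itself stops a crawler that chooses to ignore it.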
Can you ban all search engines (on purpose and for the long term)? Sure, that’s what robots.txt is for (I’m looking at you, newspaper sites that claim Google’s stealing your bacon, er, content). Some people do it just to keep search engines out; others do it to force themselves to develop other traffic streams. But if you do it, be sure to actually work on those other traffic streams, and to have a good on-site search capability.
What do you think? Would you ever block all search engines, for any reason?





