

Google Explains Recrawling for Updated News
January 25, 2010, 2:56 pm
Google has a post up on the Google News blog today explaining how it recrawls news content to provide the most up-to-date content and eliminate dead links.
"How do you balance looking for new content against the need to update older content? How can you make sure the content is fresh, doesn't link to dead pages or display headlines that have been changed by the publisher?" asks Google.
Google's answer is a recrawl feature that lets it focus on getting the newest content while still displaying the most current version of older content. After Google News discovers an article, it continues to crawl it repeatedly to look for changes. It recrawls more frequently during the first day because, as the company says, most changes to news stories are made soon after they're published.
"In some cases, we'll even revisit articles we had trouble crawling the first time around," says Google. "After that, we visit them less often. Either way, we try hard to present users with the freshest news. (We bet whoever wrote 'Dewey Defeats Truman' wishes they had recrawl!)"
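For readers who want to picture the policy Google describes, here is a minimal sketch of an age-based recrawl schedule. It is not Google's implementation: Google has not published its intervals, so the function name `next_recrawl` and all three interval values are hypothetical, chosen only to illustrate "often on the first day, less often afterward, and retry after a failed crawl."

```python
from datetime import datetime, timedelta

# Hypothetical intervals for illustration; Google has not disclosed its own.
FIRST_DAY_INTERVAL = timedelta(hours=1)   # articles under a day old
LATER_INTERVAL = timedelta(hours=12)      # older articles
RETRY_INTERVAL = timedelta(minutes=15)    # after a failed crawl


def next_recrawl(discovered_at, now, last_crawl_failed=False):
    """Return the time at which an article should next be crawled."""
    if last_crawl_failed:
        # Revisit articles that could not be crawled the last time around.
        return now + RETRY_INTERVAL
    if now - discovered_at < timedelta(days=1):
        # Most edits happen soon after publication, so check more often.
        return now + FIRST_DAY_INTERVAL
    return now + LATER_INTERVAL


if __name__ == "__main__":
    discovered = datetime(2010, 1, 25, 14, 56)
    print(next_recrawl(discovered, discovered + timedelta(hours=2)))
    print(next_recrawl(discovered, discovered + timedelta(days=2)))
```

A real crawler would likely also adjust the interval based on how often a given article actually changes, but the article does not describe that level of detail.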
Google says the feature is intended to reduce the number of outdated headlines and dead links, and for publishers, it will provide assurance that Google will index the latest stories and updates as soon as possible.




