Saturday, October 31, 2009

Google's Halloween Joke

When the Web was new, the first search engines crawled and listed every page they could find. Site owners were often unhappy about this, for several reasons:
  • Confidential information could be exposed (really a sign of a misconfigured website, but the messenger got the blame)
  • Inner pages could be displayed bypassing "guard" pages
  • The spiders could consume a lot of (then) expensive bandwidth

In 1994, before AltaVista (the first widely popular search engine) even launched, webmasters and search-engine authors agreed on a special file, "robots.txt", that a site could publish to instruct robots not to index parts of it. Most people never see this file, as it is usually uninteresting, but it is often there, and writing it is one of the less interesting parts of creating a website.

Tonight is Halloween, and Google has added these lines to the bottom of its robots.txt file:

User-agent: Kids
Disallow: /tricks
Allow: /treats
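Since robots.txt rules are machine-readable, we can check how a compliant crawler would interpret Google's joke. A minimal sketch using Python's standard-library parser, `urllib.robotparser` (the example URLs are just illustrative paths):

```python
from urllib.robotparser import RobotFileParser

# The Halloween lines from Google's robots.txt, fed to the parser
# directly instead of being fetched over the network.
rules = [
    "User-agent: Kids",
    "Disallow: /tricks",
    "Allow: /treats",
]

parser = RobotFileParser()
parser.parse(rules)

# A crawler identifying itself as "Kids" is kept away from /tricks...
print(parser.can_fetch("Kids", "https://www.google.com/tricks"))  # False
# ...but is welcome at /treats.
print(parser.can_fetch("Kids", "https://www.google.com/treats"))  # True
```

So any well-behaved robot named "Kids" gets treats, no tricks.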