RobotsTxt, robots.txt and /robots.txt

The RobotsTxt module effectively requires a hack to Drupal core in order to function. This is because Drupal core ships with a static robots.txt file, and webservers like Apache are configured by Drupal’s .htaccess file to serve static files directly rather than asking Drupal for the content, so the static file always wins. The usual hack (deleting the static robots.txt so that the request falls through to Drupal) has to be repeated every time Drupal core is upgraded, or the site is deployed to a new staging or development instance.
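For reference, the relevant stanza in Drupal’s stock .htaccess looks roughly like this (Drupal 7 shown): any request that matches an existing file is served directly and never reaches index.php, which is why the shipped robots.txt always shadows the module’s output.

    # Pass all requests not referring directly to files in the filesystem
    # to index.php -- so a request for an existing robots.txt never gets here.
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_URI} !=/favicon.ico
    RewriteRule ^ index.php [L]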

One solution which Johan hit upon a while ago was to patch Drupal’s core .htaccess file so that any request for /robots.txt is sent to Drupal rather than served from the file on disk. This still constitutes a hack to core, but because it is a patch file accessible at a URL it can be incorporated into e.g. drush make files and re-applied automatically whenever Drupal core is upgraded.
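The patch itself is the authoritative version, but as a rough sketch (assuming Drupal’s standard mod_rewrite front controller, with index.php accepting a q= query string), it amounts to adding a rule like this above the generic rewrite conditions:

    # Hand /robots.txt to Drupal's front controller even though a static
    # robots.txt file exists on disk; this must come before the
    # "only rewrite if no matching file exists" conditions.
    RewriteRule ^robots\.txt$ index.php?q=robots.txt [L]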

The patchfile-based hack is a big improvement, but it still leaves the RobotsTxt module complaining that there’s a problem. This is because the module’s requirements check looks for the robots.txt file on disk (and complains when it finds one), rather than testing the /robots.txt URL, which is the real acid test of whether the module is working properly. So now we’ve also added this patch for robotstxt_requirements(), which checks the URL instead.
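Again, the patch is the authoritative version; purely as an illustration of the idea (assuming Drupal 7 APIs such as drupal_http_request(), url() and the usual hook_requirements() conventions), a URL-based check might look something like this:

    /**
     * Implements hook_requirements() -- a sketch that checks the live
     * /robots.txt URL rather than looking for the file on disk.
     */
    function robotstxt_requirements($phase) {
      $requirements = array();
      if ($phase == 'runtime') {
        $t = get_t();
        // Fetch the site's own /robots.txt; if Drupal answers with a 200,
        // the module (and any .htaccess rewrite) is doing its job, whether
        // or not a static robots.txt file happens to exist on disk.
        $response = drupal_http_request(url('robots.txt', array('absolute' => TRUE)));
        if (isset($response->code) && $response->code == 200) {
          $requirements['robotstxt'] = array(
            'title' => $t('RobotsTxt'),
            'value' => $t('/robots.txt is being served by Drupal.'),
            'severity' => REQUIREMENT_OK,
          );
        }
        else {
          $requirements['robotstxt'] = array(
            'title' => $t('RobotsTxt'),
            'value' => $t('/robots.txt is not being served by Drupal.'),
            'description' => $t('Check the .htaccess rewrite, or delete the static robots.txt file shipped with Drupal core.'),
            'severity' => REQUIREMENT_WARNING,
          );
        }
      }
      return $requirements;
    }

The check runs at the 'runtime' phase so it shows up on the status report page, where a warning (rather than an error) seems appropriate: a mis-served robots.txt hurts search engine behaviour but doesn’t break the site.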