4 March 2008 at 11:43am
Just a quick one, I hope. I have started migrating a SilverStripe installation from a test-bench system to a live server and have noticed that search-engine requests for the /robots.txt file are returning 404 Not Found in the web server access logs.
With the redirects, rewrites and other things that mangle requests on a SilverStripe installation, what is the best way to handle requests for robots.txt, and what is a good list of URLs to disallow on a SilverStripe installation?
8 March 2008 at 11:44am Last edited: 8 March 2008 11:47am
My research into this:
Placing the robots.txt file in the root directory of the SilverStripe installation seems to work; i.e. an ls listing should show something like:
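Something along these lines (the directory names here reflect a typical SilverStripe 2.x layout and are my assumption, not taken from the original post):

```
.htaccess  assets/  cms/  index.php  jsparty/  mysite/  robots.txt  sapphire/  themes/
```

Serving the file directly works because the default .htaccess only rewrites requests for files that do not exist (the `RewriteCond %{REQUEST_FILENAME} !-f` condition), so a real robots.txt in the web root bypasses SilverStripe's URL handling entirely.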
The common Disallows would be
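For example (these particular paths are an assumption based on SilverStripe's admin, security and development URLs; adjust to suit your site):

```
Disallow: /admin/
Disallow: /Security/
Disallow: /dev/
```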
Also include a Sitemap directive, e.g.:
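Along these lines (www.example.com is a placeholder for your own domain):

```
Sitemap: http://www.example.com/sitemap.xml
```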
Check out http://www.sitemaps.org for more details on this file, which SilverStripe automatically generates for you.
So a good starting point might be:
# Show the way to the site map file
# LinkWalker knows where to go.
# Archive.org was hammering site - Now knows where to get off
User-agent: *
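Putting the pieces together, a file along these lines would match the comments above (the specific Disallow paths, the sitemap URL, and the exact handling of LinkWalker and Archive.org's ia_archiver crawler are my assumptions, since only fragments survived in this post):

```
# Show the way to the site map file
Sitemap: http://www.example.com/sitemap.xml

# LinkWalker knows where to go.
User-agent: LinkWalker
Disallow:

# Archive.org was hammering site - Now knows where to get off
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /admin/
Disallow: /Security/
Disallow: /dev/
```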