Here's our code for generating a Google Sitemap (couldn't find anything about this in the archives):
code/SiteMap.php:
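<?php
// Page type that lists every page flagged to show in search.
class SiteMap extends SiteTree {
    static $db = array(
    );
    static $has_one = array(
    );

    // Called by <% control SiteTreeList %> in templates/SiteMap.ss.
    function SiteTreeList() {
        $pages = DataObject::get("SiteTree", "ShowInSearch = 1");
        return $pages;
    }
}

class SiteMap_Controller extends ContentController {
}
?>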
templates/SiteMap.ss:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<% control SiteTreeList %>
<url>
<loc>$AbsoluteLink</loc>
</url>
<% end_control %>
</urlset>
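For reference, here is a minimal standalone sketch of the document the template should produce. It is plain PHP with no SilverStripe involved; buildSitemapXml() is a hypothetical helper and the URLs are made up. The sitemap protocol requires characters such as & to be entity-escaped inside <loc>. Whether $AbsoluteLink already does that is not something I have checked, so treat this only as a picture of the target output:

```php
<?php
// Hypothetical helper (not part of SilverStripe): builds the same <urlset>
// structure as templates/SiteMap.ss from a plain list of absolute URLs.
function buildSitemapXml(array $urls)
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($urls as $url) {
        // Entity-escape &, <, > and quotes so the result stays valid XML.
        $loc  = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
        $xml .= '<url><loc>' . $loc . '</loc></url>' . "\n";
    }
    $xml .= '</urlset>' . "\n";
    return $xml;
}

// Example URLs are assumptions for illustration only.
echo buildSitemapXml(array(
    'http://www.examplesite.com/',
    'http://www.examplesite.com/about-us/?a=1&b=2',
));
```

If the output of /sitemap/ differs much from this shape (for example, it is wrapped in your normal page layout), check that no other template is being picked up for the page.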
Rebuild your DB, and then add a new SiteMap page in the site root with a URL of "sitemap" (title and content are not used). Remember to hide it in menus and search on the behaviour tab.
Your sitemap should now be visible at "/sitemap/", but Google doesn't like this because it looks like a directory, so I add a rule to my .htaccess:
...
# Google SiteMap
RewriteRule ^sitemap\.xml$ sitemap/ [NC,PT]
RewriteCond %{REQUEST_URI} !(\.gif)|(\.jpg)|(\.png)|(\.css)|(\.js)|(\.php)$
...
to make it available as "sitemap.xml".
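To check that the rewritten URL really serves a sitemap rather than an HTML error page, something along these lines may help. This is only a sketch: sitemapLocations() is a hypothetical helper, it assumes the SimpleXML extension is available, and the sample document is made up.

```php
<?php
// Hypothetical helper: returns every <loc> value from a sitemap document,
// or false if the XML does not parse or the root is not a sitemap <urlset>.
function sitemapLocations($xml)
{
    $ns   = 'http://www.sitemaps.org/schemas/sitemap/0.9';
    $root = @simplexml_load_string($xml);
    if ($root === false
        || $root->getName() !== 'urlset'
        || !in_array($ns, $root->getNamespaces(), true)) {
        return false;
    }
    $locs = array();
    // Walk the <url> children in the sitemap namespace and collect each <loc>.
    foreach ($root->children($ns) as $url) {
        $locs[] = trim((string) $url->children($ns)->loc);
    }
    return $locs;
}

// Sample document for illustration; in practice you would pass in the body
// fetched from your own /sitemap.xml.
$sample = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    . '<url><loc>http://www.examplesite.com/</loc></url>'
    . '</urlset>';
print_r(sitemapLocations($sample));
```

The function only looks at the document structure. It does not check the Content-Type header or whether the listed URLs actually resolve.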
Finally, you can add a Sitemap line to your robots.txt file so that search engines can discover your sitemap automatically:
User-agent: *
Disallow:
Sitemap: http://www.examplesite.com/sitemap.xml
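If you want to confirm what a crawler would pick up from that file, here is a rough sketch that pulls the Sitemap lines out of robots.txt text. findSitemapUrls() is a hypothetical helper of mine, and the matching is deliberately simple (directive name matched case-insensitively, URL taken as the first run of non-space characters after the colon):

```php
<?php
// Hypothetical helper: returns the URLs given on "Sitemap:" lines
// in the contents of a robots.txt file.
function findSitemapUrls($robotsTxt)
{
    $urls = array();
    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        // Match "Sitemap:" in any letter case, then capture the URL.
        if (preg_match('/^\s*sitemap\s*:\s*(\S+)/i', $line, $m)) {
            $urls[] = $m[1];
        }
    }
    return $urls;
}

// Same contents as the robots.txt example above.
$robots = "User-agent: *\nDisallow:\nSitemap: http://www.examplesite.com/sitemap.xml\n";
print_r(findSitemapUrls($robots));
```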