Thursday, June 23, 2005

Problem with Sitemap plugin

I just noticed a problem with the Sitemap plugin that I wrote.  It looks like Google uses the location of the sitemap file to determine which URLs are valid to index.

This causes a problem because pLog is multi-user blog software, where all of the URLs are generated on the fly. For example, all views of the blogs go through index.php.  So if you want multiple blogs to use the sitemap plugin, their sitemap files will have to be kept at the root of the web directory.
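To make the restriction concrete, here is a rough Python sketch of the rule as I understand it: a sitemap file only covers URLs that start with the directory the sitemap itself lives in. The function names, the example.com host, the blog paths, and the blogId parameter are all made up for illustration; this is not code from the plugin or from Google.

```python
from urllib.parse import urlsplit
import posixpath


def sitemap_scope(sitemap_url):
    """Return the URL prefix a sitemap at sitemap_url would be allowed to cover.

    Assumption: the scope is the directory that contains the sitemap file.
    """
    parts = urlsplit(sitemap_url)
    directory = posixpath.dirname(parts.path)
    if not directory.endswith("/"):
        directory += "/"
    return f"{parts.scheme}://{parts.netloc}{directory}"


def url_in_scope(sitemap_url, page_url):
    """True if page_url falls under the directory of the sitemap file."""
    return page_url.startswith(sitemap_scope(sitemap_url))


# Hypothetical URLs: every blog view goes through index.php at the web root.
page = "http://example.com/index.php?blogId=2"

# A sitemap kept at the root covers the index.php URLs...
print(url_in_scope("http://example.com/sitemap.xml", page))
# ...but a sitemap kept in a per-blog directory does not.
print(url_in_scope("http://example.com/blogs/alice/sitemap.xml", page))
```

Under that reading, a per-blog sitemap stored anywhere below the root could never list the index.php URLs that the blogs actually use.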

I think that a better solution would be for Google not to use the location of the sitemap file to determine what can or cannot be crawled.  Maybe a Sitemap Index file could be created to point at the other sitemap files.  Maybe Google would then use the location of the index file to determine what can be crawled.
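As a sketch of the index idea, the snippet below builds a Sitemap Index document that points at per-blog sitemap files. The helper name and the example.com URLs are hypothetical, and the namespace is the one from the sitemaps.org 0.9 schema, which is an assumption here and may differ from the schema version your plugin targets. Whether Google would take the crawl scope from the index file's location is exactly the open question, so this only shows the file format.

```python
import xml.etree.ElementTree as ET

# Assumption: sitemaps.org 0.9 namespace; adjust to the schema version in use.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a Sitemap Index XML string listing the given sitemap files."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Hypothetical per-blog sitemap locations.
print(build_sitemap_index([
    "http://example.com/blogs/alice/sitemap.xml",
    "http://example.com/blogs/bob/sitemap.xml",
]))
```

The index file itself would sit at the root of the web directory, with one entry per blog.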

Or, even better, Google could just follow all of the URLs that are in the sitemap file.

