Keep each sitemap file within the documented limits: no more than 50,000 URLs per sitemap file and no larger than 50MB uncompressed
For sites exceeding these limits, create a sitemap index file (sitemapindex XML element) that references multiple child sitemap files using sitemap and loc elements
Set lastmod values to the accurate ISO 8601 last-modified date of the page content; derive this from your CMS or database, not from the sitemap generation timestamp
Compress sitemap files with gzip to reduce transfer size and serve them with a matching Content-Type header (for example, application/gzip); Search Console accepts a .xml.gz URL directly, so no uncompressed copy is needed
Submit the sitemap index URL (not individual child files) to Search Console so Google tracks all child sitemaps under one submission
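The steps above can be sketched in Python using only the standard library. This is a minimal sketch, not a definitive generator: the function names, the sitemap-N.xml.gz file naming, and the base_url parameter are assumptions made for illustration, and the (url, date) pairs are assumed to come from your CMS or database.

```python
import gzip
from datetime import date
from xml.etree import ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000               # documented per-file URL limit
MAX_BYTES = 50 * 1024 * 1024    # documented limit, measured uncompressed


def build_sitemap(entries):
    """Serialize (url, last_modified_date) pairs as one urlset document."""
    urlset = ET.Element("urlset", xmlns=NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # lastmod is the content's own modification date, not "now"
        ET.SubElement(url, "lastmod").text = lastmod.isoformat()
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


def build_index(child_urls):
    """Serialize a sitemapindex that points at each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=NS)
    for child in child_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = child
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


def build_all(entries, base_url, max_urls=MAX_URLS):
    """Split entries into child sitemaps, gzip them, and add an index.

    Returns a dict of hypothetical file name -> file bytes.
    """
    files = {}
    child_urls = []
    for start in range(0, len(entries), max_urls):
        xml = build_sitemap(entries[start:start + max_urls])
        if len(xml) > MAX_BYTES:
            raise ValueError("child sitemap exceeds 50MB uncompressed")
        name = f"sitemap-{len(child_urls) + 1}.xml.gz"
        files[name] = gzip.compress(xml)
        child_urls.append(f"{base_url}/{name}")
    files["sitemap-index.xml"] = build_index(child_urls)
    return files
```

With this layout, only the URL of sitemap-index.xml would be submitted to Search Console. The size check runs on the XML before compression because the 50MB limit applies to the uncompressed file.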
Known gotchas
Setting lastmod to today's date on every sitemap regeneration, regardless of whether content changed, trains crawlers to distrust your lastmod values and effectively ignore them
A sitemap only tells crawlers that URLs exist; URLs not linked from any other page and listed only in a sitemap may still receive low crawl priority
Sitemap files must be accessible without authentication and must not redirect through login walls; a sitemap that returns a redirect to a login page will be reported as an error in Search Console
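To avoid the lastmod gotcha when the CMS does not expose a reliable modification date, one possible fallback is to change lastmod only when a hash of the page content changes. The sketch below assumes a hypothetical update_lastmod helper and a state dictionary that you would persist between sitemap runs; neither comes from a particular library.

```python
import hashlib
from datetime import date


def update_lastmod(url, content, today, state):
    """Return the lastmod string for url, changing it only if content changed.

    state maps url -> (content_hash, lastmod) and is updated in place.
    """
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    previous = state.get(url)
    if previous is not None and previous[0] == digest:
        # Content is unchanged, so keep the previously recorded date
        return previous[1]
    lastmod = today.isoformat()
    state[url] = (digest, lastmod)
    return lastmod
```

Regenerating the sitemap every day with this helper leaves lastmod untouched for pages whose content did not change, so the values stay meaningful to crawlers.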