Create a plain-text Markdown file at 'https://yourdomain.com/llms.txt' listing your site's key documentation pages, API references, and high-signal content with brief descriptions
Structure the file with a short H1 title, a brief site description paragraph, and then H2 sections grouping related pages — each entry is a Markdown link with a one-line description
Optionally create 'llms-full.txt' containing the full text of all important pages concatenated, for AI tools that prefer a single context-window-friendly document
Do not use llms.txt as a crawler control mechanism — it cannot restrict any crawler, cannot opt your site out of AI training, and has no enforcement mechanism
Verify adoption before relying on it: as of mid-2026, no major AI company has publicly committed to production support; practical value is primarily in developer tooling (Cursor, GitHub Copilot) that fetches docs in real time
Combine llms.txt with robots.txt user-agent blocks (GPTBot, ClaudeBot, CCBot) if you want to restrict AI crawling — llms.txt alone provides no restriction
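The file layout described in the steps above (H1 title, description paragraph, H2 sections of one-line link entries) can be sketched as a small generator. This is only an illustration: the function name `render_llms_txt`, the `Sections` type, and the example URLs are hypothetical, and the exact entry format (`- [text](url): description`) is an assumption based on the structure described above.

```python
from typing import Dict, List, Tuple

# Hypothetical shape: section heading -> list of (link text, URL, one-line description).
Sections = Dict[str, List[Tuple[str, str, str]]]


def render_llms_txt(title: str, description: str, sections: Sections) -> str:
    """Render an llms.txt body: H1 title, description paragraph, H2 link sections."""
    lines = [f"# {title}", "", description, ""]
    for heading, entries in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for text, url, summary in entries:
            # One Markdown link per entry, followed by a brief description.
            lines.append(f"- [{text}]({url}): {summary}")
        lines.append("")
    # Normalise to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Placeholder content; replace with your site's real pages.
    print(
        render_llms_txt(
            "Example Docs",
            "Documentation for a hypothetical Example API.",
            {
                "API reference": [
                    (
                        "Authentication",
                        "https://yourdomain.com/docs/auth.md",
                        "How to obtain and use API keys",
                    ),
                ],
            },
        )
    )
```

Write the returned string to the file served at the site root. Generating it from a page inventory keeps the list in sync with the published docs.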
Known gotchas
llms.txt is not a web standard and is not maintained by a formal standards body: the proposal originated from Answer.AI, and adoption is voluntary with no enforcement; treat it as a best-effort signal, not a reliable control mechanism
Placing sensitive URLs in llms.txt to 'guide' AI tools exposes those URLs to any person or tool that fetches the file, because it is served publicly; only include content you intend to be publicly indexed
llms.txt and robots.txt serve entirely different purposes and neither overrides the other; listing a URL in llms.txt does not make it crawlable if robots.txt blocks it, and allowing a URL in robots.txt does not add it to llms.txt
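Because the two files are independent, it can help to cross-check them. The sketch below uses Python's standard `urllib.robotparser` to report which links in an llms.txt body are disallowed for a given user agent. The function name `blocked_links`, the link regex, and the sample file contents are assumptions for illustration, not part of either file format.

```python
import re
from typing import List
from urllib.robotparser import RobotFileParser

# Matches Markdown links of the form [text](http(s)://...) and captures the URL.
LINK_RE = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")


def blocked_links(llms_txt: str, robots_txt: str, user_agent: str) -> List[str]:
    """Return the URLs listed in llms_txt that robots_txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    urls = LINK_RE.findall(llms_txt)
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # Sample robots.txt: block GPTBot entirely, block /private/ for everyone else.
    robots = (
        "User-agent: GPTBot\n"
        "Disallow: /\n"
        "\n"
        "User-agent: *\n"
        "Disallow: /private/\n"
    )
    # Sample llms.txt body with one public and one private link.
    llms = (
        "# Example Docs\n\n"
        "## Docs\n\n"
        "- [Auth](https://yourdomain.com/docs/auth.md): API keys\n"
        "- [Notes](https://yourdomain.com/private/notes.md): internal notes\n"
    )
    for agent in ("GPTBot", "ClaudeBot"):
        print(agent, blocked_links(llms, robots, agent))
```

Any URL this reports is advertised in llms.txt but disallowed by robots.txt for that agent, which is usually a sign that one of the two files needs updating.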