Steps

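Understand the RFC 9309 rule: when both an Allow and a Disallow rule match a URL, the most specific rule (the longest matching path, measured in octets) wins, whether it is Allow or Disallow; if the two are equally specific, the Allow rule should be used
Test conflicting rule pairs against the specific URL to see which rule wins. Google retired the standalone robots.txt Tester in Search Console in late 2023, so use the URL Inspection tool to check whether a URL is blocked, the robots.txt report to confirm which file Google fetched, or Google's open-source robots.txt parser for offline checks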
Resolve ambiguity by making the more permissive path explicitly more specific; for example, Allow: /admin/public/ paired with Disallow: /admin/ correctly allows the subdirectory
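Declare Googlebot, Googlebot-Image, and Googlebot-News in separate user-agent groups if you need to apply different rules to each crawler type; each crawler follows only the most specific group that matches its name, so rules in the * group are not inherited by a crawler that has its own group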
After editing, confirm the file is served from the exact path https://yourdomain.com/robots.txt at the root of the host with no redirect
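
The precedence rule in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not Google's parser: the helper names pattern_to_regex and resolve are hypothetical, rules are assumed to be already parsed for a single user-agent group, and percent-encoding normalization is skipped. The standard library's urllib.robotparser is not used because, as far as can be told from its behavior, it applies rules in file order rather than by longest match.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + (r"\Z" if anchored else ""))

def resolve(rules, path):
    """Return "allow" or "disallow" for a URL path.

    rules: list of (directive, pattern) tuples for one user-agent group,
    e.g. [("disallow", "/admin/"), ("allow", "/admin/public/")].
    """
    best = None  # (pattern length in octets, is_allow)
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern.encode("utf-8")), directive.lower() == "allow")
            # Longer pattern wins; on equal length True > False, so Allow wins.
            if best is None or candidate > best:
                best = candidate
    if best is None:
        return "allow"  # no rule matched, so crawling is permitted
    return "allow" if best[1] else "disallow"

rules = [("disallow", "/admin/"), ("allow", "/admin/public/")]
print(resolve(rules, "/admin/public/page.html"))
print(resolve(rules, "/admin/settings"))
```

Storing each match as a (length, is_allow) tuple lets ordinary tuple comparison express both parts of the rule: the longer pattern wins, and a tie falls to Allow.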

Known gotchas

RFC 9309 requires crawlers to parse at least the first 500 kibibytes of a robots.txt file; a crawler may stop there, and Google ignores everything past that limit, so valid rules placed later in an oversized file can be silently dropped
The robots.txt Disallow directive is a crawling hint, not an access control mechanism; pages blocked by robots.txt can still appear in search results if they are linked from other pages, because Google can infer their existence from links
Wildcard patterns (* and $) were originally extensions to the 1994 robots.txt convention; RFC 9309 now defines them, but not every crawler implements the RFC, so test behavior for each crawler you target if precision matters
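
The 500 KiB parsing limit noted above can be checked before deploying. The sketch below is an assumption-laden illustration: the function name rules_past_limit is hypothetical, it takes the raw file bytes however you obtained them, and it only looks for Allow and Disallow lines, not other directives.

```python
MAX_BYTES = 500 * 1024  # 500 kibibytes

def rules_past_limit(body: bytes, limit: int = MAX_BYTES):
    """List Allow/Disallow lines that start after the first `limit` bytes.

    A crawler that stops parsing at the limit would never see these rules.
    A line cut in half by the limit is not reported, because its remainder
    no longer starts with a directive name.
    """
    ignored = body[limit:]
    lines = ignored.decode("utf-8", errors="replace").splitlines()
    return [
        line.strip()
        for line in lines
        if line.strip().lower().startswith(("allow:", "disallow:"))
    ]

# Example: a rule pushed past the limit by a very long comment line.
oversized = b"User-agent: *\n" + b"#" * MAX_BYTES + b"\nDisallow: /late/\n"
print(rules_past_limit(oversized))
```

An empty list means no Allow or Disallow line begins beyond the limit.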

Programmatically test robots.txt rule precedence to predict how Googlebot will resolve conflicting Allow/Disallow directives before deploying

developers.google.com · 6 steps · unrated

Configure robots.txt directives that allow AI shopping and search-agent crawlers (e.g. OAI-SearchBot, PerplexityBot, Amazonbot) onto product pages while blocking pure AI-training crawlers from the same paths

platform.openai.com · 6 steps · unrated


Apply robots.txt precedence rules correctly when Allow and Disallow directives conflict for the same path
