After navigating to a page, call const snapshot = await page.ariaSnapshot() to get a YAML string representing the accessibility tree
Pass the snapshot as a message to your LLM (e.g. as the user turn content): { role: 'user', content: 'Current page state:\n' + snapshot + '\nWhat action should I take next?' }
Parse the LLM's response to extract a structured action (e.g. { action: 'click', label: 'Submit' })
Locate the element and execute the action: await page.getByRole('button', { name: 'Submit' }).click()
Re-snapshot after each action: const next = await page.ariaSnapshot() and feed it back to the LLM for the next step
Terminate the loop when the LLM returns a 'done' signal or a maximum step count is reached
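The steps above can be sketched as a single loop. This is a minimal sketch under stated assumptions, not a definitive implementation: askLlm, AgentAction, parseAction, runAgent, and the PageLike/LocatorLike types are hypothetical names introduced for illustration, and the model is assumed to reply with a JSON object somewhere in its text. The snapshot is taken through the locator form, page.locator('body').ariaSnapshot(), and the optional role field (defaulting to 'button') is an assumption beyond the { action, label } shape shown above.

```typescript
// Minimal structural types so the sketch runs without a browser.
// A real Playwright `Page` should satisfy `PageLike` structurally.
interface LocatorLike {
  ariaSnapshot(): Promise<string>;
  click(): Promise<void>;
}
interface PageLike {
  locator(selector: string): LocatorLike;
  getByRole(role: string, options?: { name?: string }): LocatorLike;
}

// Hypothetical action schema; `role` is an assumed extension of { action, label }.
type AgentAction =
  | { action: 'click'; label: string; role: string }
  | { action: 'done' };

type ChatMessage = { role: 'user' | 'assistant'; content: string };

// Hypothetical LLM client supplied by the caller: takes the chat history, returns reply text.
type AskLlm = (messages: ChatMessage[]) => Promise<string>;

// Extract the first {...} span from the reply and validate it as an action.
function parseAction(reply: string): AgentAction | null {
  const match = reply.match(/\{[\s\S]*\}/);
  if (!match) return null;
  try {
    const parsed = JSON.parse(match[0]);
    if (parsed.action === 'done') return { action: 'done' };
    if (parsed.action === 'click' && typeof parsed.label === 'string') {
      const role = typeof parsed.role === 'string' ? parsed.role : 'button';
      return { action: 'click', label: parsed.label, role };
    }
    return null;
  } catch {
    return null; // malformed JSON
  }
}

// Runs the snapshot -> LLM -> action loop; returns the number of actions executed.
async function runAgent(page: PageLike, askLlm: AskLlm, maxSteps = 10): Promise<number> {
  const messages: ChatMessage[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const snapshot = await page.locator('body').ariaSnapshot();
    messages.push({
      role: 'user',
      content: 'Current page state:\n' + snapshot + '\nWhat action should I take next?',
    });
    const reply = await askLlm(messages);
    messages.push({ role: 'assistant', content: reply });

    const action = parseAction(reply);
    // Stop on 'done'; this sketch also stops on an unparseable reply.
    if (action === null || action.action === 'done') return step;

    await page.getByRole(action.role, { name: action.label }).click();
  }
  return maxSteps; // step limit reached
}
```

Stopping on an unparseable reply is one design choice among several; a real agent might instead re-prompt the model with the parse error before giving up.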
Known gotchas
ARIA snapshots omit purely decorative elements and carry no layout or position information; agents that rely on visual cues (e.g. 'click the button on the right') need a screenshot alongside the snapshot
Large SPAs with deeply nested components can produce snapshots exceeding 2,000 tokens; scope snapshots to a specific locator with locator.ariaSnapshot() to reduce size
The snapshot reflects the live DOM state at the moment of the call, so content that loads asynchronously after the initial render may be missing; wait for expected elements before snapshotting
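Two of the gotchas above (snapshot size and asynchronous content) can be handled in one helper. The following is a rough sketch, not a definitive implementation: snapshotRegion, truncateSnapshot, and the ScopablePage/ScopedLocator types are hypothetical names, the 'main' selector in the usage note is only an example, and the character budget is an assumed crude stand-in for a token limit (characters per token vary by model).

```typescript
// Minimal structural types so the sketch runs without a browser.
// A real Playwright `Page` and `Locator` should satisfy these structurally.
interface ScopedLocator {
  waitFor(options?: { state?: 'visible'; timeout?: number }): Promise<void>;
  ariaSnapshot(): Promise<string>;
}
interface ScopablePage {
  locator(selector: string): ScopedLocator;
}

// Crude size guard: cut the snapshot at a character budget and mark the cut.
function truncateSnapshot(snapshot: string, maxChars: number): string {
  if (snapshot.length <= maxChars) return snapshot;
  return snapshot.slice(0, maxChars) + '\n# [snapshot truncated]';
}

// Wait for the region to be visible, then snapshot only that subtree.
async function snapshotRegion(
  page: ScopablePage,
  selector: string,
  timeoutMs = 5000, // assumed default; tune per application
  maxChars = 8000,  // assumed budget, not a Playwright setting
): Promise<string> {
  const region = page.locator(selector);
  // Rejects if the element does not become visible within the timeout.
  await region.waitFor({ state: 'visible', timeout: timeoutMs });
  const snapshot = await region.ariaSnapshot();
  return truncateSnapshot(snapshot, maxChars);
}
```

A call such as await snapshotRegion(page, 'main') would then replace the full-page snapshot in the loop. Truncating mid-tree can hide the element the agent needs, so narrowing the selector is usually preferable to relying on the cut.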