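Fetch the product detail page HTML and extract every <script type='application/ld+json'> block. Parse each block as JSON and keep the objects whose '@type' equals 'Product' or is an array containing 'Product'; a block may hold a single object, a top-level array, or an '@graph' array of nodes, so check all three.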
Fall back to Microdata extraction (itemtype='https://schema.org/Product') if JSON-LD is absent; walk itemprop attributes to reconstruct the same field set.
Normalize the offers field: it may be a single Offer object or an array of Offer objects representing variants; iterate all and pick the lowest or most relevant price for your use case.
Map schema.org availability URIs (e.g., 'https://schema.org/InStock', 'https://schema.org/OutOfStock') to your internal enum so downstream logic is decoupled from raw URI strings; match on the final path segment, since sites also emit the 'http://' form of these URIs.
Cache the extracted structured data with a TTL derived from the page's Cache-Control header to avoid re-fetching on every agent loop iteration.
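The JSON-LD path through the steps above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not a definitive implementation: the names `JsonLdCollector`, `find_products`, `lowest_offer`, `normalize_availability`, and `ttl_from_cache_control`, the enum values in `AVAILABILITY`, and the 300-second default TTL are all hypothetical. It picks the lowest price, skips the Microdata fallback, and ignores blocks that fail strict parsing.

```python
import json
import re
from decimal import Decimal, InvalidOperation
from html.parser import HTMLParser

# Hypothetical internal enum values; substitute your own.
AVAILABILITY = {"InStock": "in_stock", "OutOfStock": "out_of_stock"}


class JsonLdCollector(HTMLParser):
    """Collects the raw text of each <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buf = None  # None while outside a JSON-LD script element

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._buf = []

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buf is not None:
            self.blocks.append("".join(self._buf))
            self._buf = None


def is_product(node):
    node_type = node.get("@type")
    return node_type == "Product" or (isinstance(node_type, list) and "Product" in node_type)


def find_products(html):
    """Return every Product node found in the page's JSON-LD blocks."""
    collector = JsonLdCollector()
    collector.feed(html)
    products = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed block: simply skipped in this sketch
        for top in data if isinstance(data, list) else [data]:
            if not isinstance(top, dict):
                continue
            graph = top.get("@graph")
            nodes = [top] + (graph if isinstance(graph, list) else [])
            products.extend(n for n in nodes if isinstance(n, dict) and is_product(n))
    return products


def normalize_availability(uri):
    """Map an availability URI to the internal enum by its last path segment."""
    if not isinstance(uri, str):
        return "unknown"
    return AVAILABILITY.get(uri.rstrip("/").rsplit("/", 1)[-1], "unknown")


def lowest_offer(product):
    """Treat 'offers' as a list whether it is one Offer or many; return the cheapest."""
    offers = product.get("offers")
    if isinstance(offers, dict):
        offers = [offers]
    elif not isinstance(offers, list):
        offers = []
    priced = []
    for offer in offers:
        try:
            price = Decimal(str(offer["price"]))
        except (KeyError, TypeError, InvalidOperation):
            continue  # offer without a usable price
        if price.is_finite():
            priced.append((price, offer))
    if not priced:
        return None
    price, offer = min(priced, key=lambda pair: pair[0])
    return {
        "price": price,
        "currency": offer.get("priceCurrency"),
        "availability": normalize_availability(offer.get("availability")),
    }


def ttl_from_cache_control(header, default=300):
    """Derive a cache TTL in seconds from a Cache-Control header value."""
    value = (header or "").lower()
    if "no-store" in value or "no-cache" in value:
        return 0
    match = re.search(r"\bmax-age\s*=\s*(\d+)", value)
    return int(match.group(1)) if match else default
```

Prices are parsed as `Decimal` rather than `float` so that values such as '9.99' compare exactly. If "most relevant" for your use case means something other than cheapest, replace the `min` call with your own selection rule.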
Known gotchas
Many sites embed malformed JSON-LD (trailing commas, unescaped characters); use a lenient JSON parser or sanitize the block before parsing and log parse failures for debugging.
Schema.org markup on the page may be stale or differ from the canonical API data; always treat it as a best-effort signal and confirm critical fields (price, stock) via a merchant API when one is available.
Some merchants populate the markup with placeholder or templated values that don't reflect the actually selected variant; check whether the page uses JavaScript-driven variant selection and, if so, whether you need a headless browser to get accurate values.
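For the malformed JSON-LD gotcha, one possible sanitize-then-parse approach is sketched below, using only the standard library. The function name `parse_jsonld_lenient` and the logger name are hypothetical. The sketch handles just two common defects (trailing commas and raw control characters inside strings) and logs anything it still cannot parse.

```python
import json
import logging
import re

log = logging.getLogger("jsonld")  # hypothetical logger name

# A comma followed only by whitespace before a closing } or ].
_TRAILING_COMMA = re.compile(r",(\s*[}\]])")


def parse_jsonld_lenient(raw):
    """Parse a JSON-LD block, tolerating trailing commas and raw control characters.

    Returns the parsed value, or None if the block is still unparseable.
    """
    try:
        # strict=False lets raw newlines and tabs appear inside string values.
        return json.loads(raw, strict=False)
    except json.JSONDecodeError:
        pass
    # Only reached for blocks that already failed to parse. Caveat: this regex is
    # not string-aware, so a literal ",}" inside a string value would be altered too.
    cleaned = _TRAILING_COMMA.sub(r"\1", raw.strip())
    try:
        return json.loads(cleaned, strict=False)
    except json.JSONDecodeError as exc:
        log.warning("JSON-LD parse failed at char %d: %s", exc.pos, exc.msg)
        return None
```

Because the clean-up runs only after a normal parse has failed, well-formed blocks are never modified. A dedicated lenient parser is the safer choice when the markup is broken in other ways.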