Fetch the hospital's root-level TXT or links.json index file (linked from the hospital homepage footer) to discover the URL of the current machine-readable file (MRF).
Stream-download the MRF with a streaming HTTP client rather than loading the entire payload into memory, since files frequently exceed 1 GB. Decompress gzip on the fly if Content-Encoding indicates it; zip is not a Content-Encoding value, so detect zip archives from the file extension or Content-Type instead.
Use a streaming JSON parser to iterate over each charge object, extracting fields such as the code, code type, payer name, plan name, charge type, negotiated rate or percentage, and where applicable the median allowed amount and percentile fields.
Normalize the extracted rows into a staging table with one row per code-payer-plan combination, casting numeric fields to DECIMAL with sufficient precision.
Deduplicate on (hospital_id, code, code_type, payer_name, plan_name) because some MRFs repeat rows across billing code aliases.
Schedule a monthly refresh that re-fetches the index file to pick up newly posted MRFs; compare the last_updated date in the file header to skip unchanged files.
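The on-the-fly gzip decompression from the download step can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not a definitive implementation: `iter_decompressed` is a hypothetical helper, the chunks would come from whichever streaming HTTP client you use, and only gzip is handled (zip archives need a different approach).

```python
import gzip
import zlib
from typing import Iterable, Iterator


def iter_decompressed(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield decompressed bytes from an iterable of gzip-compressed chunks.

    Hypothetical helper: `chunks` stands in for the body of a streaming
    HTTP response, read one piece at a time.
    """
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = decoder.decompress(chunk)
        if data:
            yield data
    # Emit anything still buffered once the input is exhausted.
    tail = decoder.flush()
    if tail:
        yield tail


# Usage: simulate a chunked response body with an in-memory payload.
original = b'{"code": "99213"}' * 1000
payload = gzip.compress(original)
pieces = (payload[i:i + 4096] for i in range(0, len(payload), 4096))
restored = b"".join(iter_decompressed(pieces))
```

In a real pipeline the decompressed pieces would be handed to the streaming JSON parser rather than joined, so that memory use stays bounded by the chunk size.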
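The normalization and deduplication steps above can be sketched as follows. This is an illustrative sketch, not a definitive implementation: it assumes the streaming parser has already flattened each charge into a dict, and the field names `negotiated_dollar` and `negotiated_percentage` are hypothetical placeholders for whatever the MRF schema version in use actually calls them.

```python
from decimal import Decimal, InvalidOperation
from typing import Iterable, Iterator, Optional

# Key columns taken from the deduplication step.
KEY_FIELDS = ("hospital_id", "code", "code_type", "payer_name", "plan_name")


def to_decimal(value: object) -> Optional[Decimal]:
    """Cast a raw JSON value to Decimal, returning None if it is unusable."""
    if value is None or value == "":
        return None
    try:
        # Go through str() so floats do not carry binary rounding noise.
        return Decimal(str(value))
    except InvalidOperation:
        return None


def normalize(records: Iterable[dict]) -> Iterator[dict]:
    """Yield one staging row per unique key, keeping the first occurrence."""
    seen = set()
    for rec in records:
        key = tuple(rec.get(field) for field in KEY_FIELDS)
        if key in seen:
            continue  # duplicate of a row already emitted
        seen.add(key)
        row = dict(zip(KEY_FIELDS, key))
        # Hypothetical field names; adjust to the schema being parsed.
        row["negotiated_dollar"] = to_decimal(rec.get("negotiated_dollar"))
        row["negotiated_percentage"] = to_decimal(rec.get("negotiated_percentage"))
        yield row


# Usage: the second record repeats the key of the first and is dropped.
base = {"hospital_id": "h1", "code": "99213", "code_type": "CPT",
        "payer_name": "Acme", "plan_name": "PPO"}
raw = [
    {**base, "negotiated_dollar": 125.5},
    {**base, "negotiated_dollar": "125.50"},
]
rows = list(normalize(raw))
```

Tracking seen keys in a Python set is only practical for files of modest size; for very large MRFs, a unique constraint or a `DISTINCT` pass in the staging database may be a better fit.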
Known gotchas
Many hospitals publish MRFs that exceed 5 GB uncompressed. DOM-based JSON parsers, which build the whole document in memory, are likely to exhaust memory on files this size; use a streaming parser such as ijson (Python) or a SAX-style equivalent.
MRF URLs change when files are regenerated; always resolve the current URL from the index file rather than hardcoding the MRF URL.
Percentage-based rates cannot be converted to dollar values without the corresponding median allowed amount fields; downstream analytics that ignore these fields will misstate real consumer cost exposure.
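One way to handle percentage-based rates downstream is sketched below. It is an assumption-laden illustration rather than a prescribed method: `dollar_estimate` is a hypothetical helper, the column names are placeholders, and falling back to the median allowed amount as the dollar figure is one plausible reading of how these fields are meant to be used.

```python
from decimal import Decimal
from typing import Optional


def dollar_estimate(row: dict) -> Optional[Decimal]:
    """Return a dollar figure for a staging row, or None if none is available."""
    # Prefer an explicit negotiated dollar amount when the row has one.
    if row.get("negotiated_dollar") is not None:
        return row["negotiated_dollar"]
    # A bare percentage has no dollar meaning on its own, so fall back to
    # the median allowed amount (assumed column name) if it is present.
    if row.get("negotiated_percentage") is not None:
        return row.get("median_allowed_amount")
    return None


# Usage: one dollar-based row, one percentage-based row.
dollar_row = {"negotiated_dollar": Decimal("125.50")}
percent_row = {
    "negotiated_percentage": Decimal("62"),
    "median_allowed_amount": Decimal("310.00"),
}
estimates = [dollar_estimate(dollar_row), dollar_estimate(percent_row)]
```

Rows that carry a percentage but no median allowed amount return `None`, which makes the gap visible instead of silently treating the rate as zero.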