Obtain a free api.data.gov API key at https://api.data.gov/signup — required for the GSA CKAN proxy endpoint
Search datasets: GET https://api.gsa.gov/technology/datagov/v3/action/package_search?api_key={key}&q={keyword}&fq=organization:{agency_name}&rows=20&start=0
Retrieve all datasets for an organization: GET .../package_search?api_key={key}&fq=organization:doi-gov&rows=1000&start=0, then increase start by 1000 on each request until start reaches the count value reported in the response
Fetch full dataset metadata: GET https://api.gsa.gov/technology/datagov/v3/action/package_show?api_key={key}&id={package_id} — returns resources array with each distribution format, download URL, and media type
Filter to specific data formats by scanning resources[*].format for 'CSV', 'JSON', 'API', or 'GeoJSON'; extract resources[*].url for download or endpoint access
Monitor for new or updated datasets: request package_search with sort=metadata_modified desc and store the newest metadata_modified timestamp seen, so each polling cycle only has to read results newer than the stored value
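The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference client: the helper names (search_url, iter_datasets, matching_resources, newest_modified) are hypothetical, and the code assumes the proxy returns the usual CKAN action envelope with result.count and result.results, and that metadata_modified values are ISO 8601 strings in a uniform format so that plain string comparison orders them correctly.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://api.gsa.gov/technology/datagov/v3/action"
WANTED_FORMATS = {"CSV", "JSON", "API", "GEOJSON"}


def search_url(api_key, start=0, rows=1000, q=None, organization=None, sort=None):
    """Build a package_search URL; only the parameters that are set are sent."""
    params = {"api_key": api_key, "rows": rows, "start": start}
    if q:
        params["q"] = q
    if organization:
        params["fq"] = f"organization:{organization}"
    if sort:
        params["sort"] = sort  # e.g. "metadata_modified desc"
    return f"{BASE}/package_search?{urllib.parse.urlencode(params)}"


def fetch_json(url):
    """Perform the GET request and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_datasets(api_key, organization, rows=1000, fetch=fetch_json):
    """Yield every dataset for an organization, advancing start by rows."""
    start = 0
    while True:
        # Assumed response shape: {"result": {"count": int, "results": [...]}}
        result = fetch(search_url(api_key, start, rows, organization=organization))["result"]
        yield from result["results"]
        start += rows
        if start >= result["count"] or not result["results"]:
            break


def matching_resources(dataset, wanted=WANTED_FORMATS):
    """Return (format, url) pairs whose format is wanted, ignoring case."""
    return [
        (r.get("format"), r.get("url"))
        for r in dataset.get("resources", [])
        if (r.get("format") or "").upper() in wanted
    ]


def newest_modified(datasets, previous=""):
    """Return the latest metadata_modified value, or previous if none is newer."""
    return max([previous] + [d.get("metadata_modified", "") for d in datasets])
```

Passing fetch as a parameter keeps the pagination logic testable without network access; in real use the default fetch_json issues the requests. The value returned by newest_modified can be persisted between polling cycles and passed back in as previous.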
Known gotchas
data.gov CKAN contains only dataset metadata, not the actual data files — the resource download URLs link to the hosting agency's servers, which may have separate access controls, rate limits, or broken links independent of the CKAN API
The fq (filter query) parameter uses Solr query syntax — field names are CKAN-internal (organization, tags, res_format, extras_harvest_source_id) and do not always match the field names returned in the JSON response
Dataset organization slugs (e.g., 'doi-gov', 'noaa-gov') must match the exact CKAN organization name; these are not always predictable from the agency acronym and are best discovered via .../organization_list
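Two of the gotchas above lend themselves to small helpers, sketched here in Python. The function names are hypothetical, and the quoting approach is an assumption based on standard Solr phrase syntax (double-quoted values, clauses joined with AND); whether the proxy accepts every combination should be verified against live responses. candidate_slugs expects the list of organization names already retrieved from organization_list.

```python
def solr_term(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_fq(**filters):
    """Join field:value clauses with AND, using CKAN-internal field names."""
    return " AND ".join(f"{field}:{solr_term(v)}" for field, v in filters.items())


def candidate_slugs(slugs, fragment):
    """Return the organization slugs containing fragment, case-insensitively."""
    fragment = fragment.lower()
    return sorted(s for s in slugs if fragment in s.lower())
```

For example, build_fq(organization="doi-gov", res_format="CSV") produces a string suitable for the fq parameter once URL-encoded, and candidate_slugs narrows a long organization list to the few entries worth checking by hand.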