Identify slow or frequently executed PromQL expressions in dashboards using Prometheus query statistics, such as the timings recorded in the query log (query_log_file)
Write a recording rule expression that performs the expensive aggregation (for example, sum or rate over high-cardinality labels) and name the result using the convention level:metric:operations
Add the rule to a rule file under a named group with an evaluation interval that matches the metric's scrape interval, and make sure the file is listed under rule_files in the Prometheus configuration
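Reload the rule file by sending SIGHUP to the Prometheus process or a POST to /-/reload (available only when Prometheus runs with --web.enable-lifecycle), then verify that the new series appears in the Prometheus UI with the expected label set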
Update dashboard queries to reference the pre-aggregated metric series name rather than the raw metric
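The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the metric http_requests_total, the rule name job:http_requests:rate5m, the group name, the 30s interval, and the base URL http://localhost:9090 are all assumptions made for the example. The reload helper only works if Prometheus was started with --web.enable-lifecycle.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical names for illustration; substitute your own metric and labels.
RECORD_NAME = "job:http_requests:rate5m"
EXPR = "sum by (job) (rate(http_requests_total[5m]))"


def render_rule_file(group: str, interval: str, record: str, expr: str) -> str:
    """Return the YAML text of a rule file holding one recording rule."""
    return (
        "groups:\n"
        f"  - name: {group}\n"
        f"    interval: {interval}\n"
        "    rules:\n"
        f"      - record: {record}\n"
        f"        expr: {expr}\n"
    )


def reload_prometheus(base_url: str) -> int:
    """POST /-/reload and return the HTTP status code."""
    req = urllib.request.Request(f"{base_url}/-/reload", data=b"", method="POST")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status


def instant_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def label_sets(base_url: str, promql: str) -> list:
    """Return the label set of every series the query currently yields."""
    with urllib.request.urlopen(instant_query_url(base_url, promql), timeout=10) as resp:
        body = json.load(resp)
    return [series["metric"] for series in body["data"]["result"]]


if __name__ == "__main__":
    # Print the rule file; write it to a path listed under rule_files to use it.
    print(render_rule_file("http_aggregations", "30s", RECORD_NAME, EXPR))
```

After writing the rendered text to a rule file, it can be validated with promtool check rules before reloading. Calling reload_prometheus and then label_sets with the new rule name is one way to confirm that the series exists and carries only the labels kept by the aggregation (here, job).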
Known gotchas
A recording rule evaluation interval longer than the source metric's scrape interval means the pre-aggregated series updates less often than the raw data, so dashboards built on it can show stale values
Recording rules run on the local Prometheus instance; in a federated or remote-write setup, the rules must be defined on the instance that holds the raw data
Renaming a recording rule that is already in production breaks dashboards and alerts that reference the old name; keep both names active during a transition period
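Two of the gotchas above lend themselves to a small check, sketched here in Python under stated assumptions: the helper names, the old rule name http_requests_rate, and the new rule name job:http_requests:rate5m are hypothetical, and the duration parser covers only simple Prometheus-style durations such as 30s or 1m30s.

```python
import re

# Seconds per unit for Prometheus-style durations (a year is counted as 365 days).
_UNITS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def duration_seconds(text: str):
    """Parse a duration such as '30s' or '1m30s' into seconds."""
    parts = re.findall(r"(\d+)(ms|[smhdwy])", text)
    # Reject input that contains anything besides number/unit pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"not a duration: {text!r}")
    return sum(int(n) * _UNITS[u] for n, u in parts)


def lags_behind(rule_interval: str, scrape_interval: str) -> bool:
    """True when the rule is evaluated less often than the metric is scraped."""
    return duration_seconds(rule_interval) > duration_seconds(scrape_interval)


def transition_rules(old_name: str, new_name: str, expr: str) -> str:
    """YAML rule entries that record the same expression under both names."""
    return "".join(
        f"      - record: {name}\n        expr: {expr}\n"
        for name in (new_name, old_name)
    )


if __name__ == "__main__":
    print(lags_behind("1m", "15s"))
    print(transition_rules(
        "http_requests_rate",  # hypothetical old name
        "job:http_requests:rate5m",  # hypothetical new name
        "sum by (job) (rate(http_requests_total[5m]))",
    ))
```

The entries returned by transition_rules are meant to sit under the rules key of an existing group. Once every dashboard and alert references the new name, the entry for the old name can be dropped.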