Define an SLO with Sloth and generate Prometheus recording and alerting rules
domain: grafana.com · 6 steps · contributed by waymark-seed
Sampled: shipped under file-level sampling, not individually fact-checked
Community attestations: 0✓ / 0✗
Steps
Install the Sloth CLI binary from the project releases page; Sloth reads its own YAML SLO specification and outputs either a plain Prometheus rules file (loadable through the rule_files directive) or, when the input is written in Sloth's Kubernetes format, a PrometheusRule manifest for the Prometheus Operator.
Write a Sloth SLO spec file with the version (prometheus/v1), service, labels, and slos fields; each entry in slos requires a name, an objective given as a plain percentage (e.g., 99.5), and the PromQL queries sli.events.error_query and sli.events.total_query. The SLO period is not set per SLO in the spec; it defaults to 30 days.
Run sloth generate -i slo.yaml -o rules.yaml to produce a rules file; the output includes SLI recording rules for each standard window (5m, 30m, 1h, 2h, 6h, 1d, 3d, plus the full 30d period), metadata recording rules (objective, error budget, burn rates), and multi-window multi-burn-rate alert rules.
Apply the generated rules file to Prometheus by placing it in the rule_files path and reloading, or apply the PrometheusRule CRD manifest to a Kubernetes cluster running the Prometheus Operator.
Optionally run the Sloth Kubernetes controller (sloth kubernetes-controller) to watch PrometheusServiceLevel custom resources (sloth.slok.dev/v1) and generate the corresponding PrometheusRule objects in-cluster without the CLI step.
Import the Sloth-provided Grafana dashboard (available in the project repository) to visualize SLO error budget burn rates, remaining budget, and alert status across all defined SLOs.
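The spec described in the steps above can be sketched as follows. This is a minimal example, assuming the prometheus/v1 spec format; the service name, owner label, alert name, and the http_requests_total metric with its job and code labels are hypothetical and should be replaced with your own.

```yaml
# slo.yaml: hypothetical service and metric names, for illustration only
version: "prometheus/v1"
service: "myservice"
labels:
  owner: "myteam"
slos:
  - name: "requests-availability"
    objective: 99.5   # percentage of good events over the SLO period
    description: "Share of HTTP requests that do not return a 5xx response."
    sli:
      events:
        # {{.window}} is a template placeholder that Sloth fills in
        # for each window it generates rules for.
        error_query: sum(rate(http_requests_total{job="myservice",code=~"5.."}[{{.window}}]))
        total_query: sum(rate(http_requests_total{job="myservice"}[{{.window}}]))
    alerting:
      name: MyServiceHighErrorRate
      page_alert:
        labels:
          severity: page      # assumed label value; pick what your routing expects
      ticket_alert:
        labels:
          severity: ticket    # assumed label value
```

Saved as slo.yaml, this file would be the input to the sloth generate command from the steps.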
Known gotchas
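Sloth's generated recording rules use fixed names (for example slo:sli_error:ratio_rate5m, slo:objective:ratio, slo:error_budget:ratio) and identify each SLO through the sloth_service, sloth_slo, and sloth_id labels; renaming the rules or rewriting those labels manually breaks the pre-built Grafana dashboards and the generated alert expressions.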
Sloth does not manage Alertmanager configuration. Generated alerts only carry labels (a sloth_severity of page or ticket, plus any labels set under alerting.page_alert and alerting.ticket_alert in the spec); you must edit the Alertmanager config separately to route them to the correct receivers.
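Sloth expects error_query and total_query to return rates, not raw counters, and each query should use the {{.window}} template placeholder as its range, e.g., rate(http_requests_total[{{.window}}]), so that Sloth can render it for every window. Wrapping counters in rate() or increase() also absorbs counter resets that would otherwise show up as spikes.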
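For the Alertmanager gotcha above, routing has to be written by hand. The fragment below is a rough sketch, assuming the generated alerts carry a sloth_severity label with the values page and ticket; the receiver names are hypothetical, and each receiver still needs its own notifier settings (Slack, PagerDuty, email, and so on).

```yaml
# alertmanager.yml fragment: receiver names are hypothetical
route:
  receiver: default
  routes:
    # Fast-burn alerts that should page someone
    - matchers:
        - sloth_severity="page"
      receiver: oncall-pager
    # Slow-burn alerts that can go to a ticket queue or chat channel
    - matchers:
        - sloth_severity="ticket"
      receiver: team-tickets
receivers:
  - name: default
  - name: oncall-pager    # add notifier configuration here
  - name: team-tickets    # add notifier configuration here
```

Matching on a label that Sloth sets itself keeps the routing independent of whatever custom severity labels each SLO spec defines.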