{"id":"c9722b3d-3d98-472c-b3de-8ad4ab073b78","task":"Implement multiwindow multi-burn-rate SLO alerting for error budget management","domain":"prometheus.io","steps":["Define the SLO error rate as recording rules that compute the ratio of bad events to total events over each alerting window, and record the target availability alongside it","Calculate the burn rate as the ratio of the current error rate to the acceptable steady-state error rate (1 - SLO target)","Create a fast-burn alert using a short window (such as 1 hour) and a high burn-rate threshold (for example 14.4, which spends 2% of a 30-day budget in 1 hour) to catch sudden outages","Create a slow-burn alert using a longer window (such as 6 hours or 3 days) and a lower burn-rate threshold (for example 6 over 6 hours or 1 over 3 days, for a 30-day SLO period) to catch gradual degradation","Within each alert, require both the long window and a much shorter confirmation window (commonly 1/12 of the long window, such as 5 minutes for 1 hour) to exceed the same threshold simultaneously, so the alert ignores transient spikes and resets quickly once the errors stop"],"gotchas":["Using only a short window for burn-rate alerts causes alert fatigue from transient spikes that do not materially impact the error budget; always pair with a longer window check","Burn-rate thresholds are derived from the SLO period and the share of budget each alert may consume, and the underlying error-rate threshold is the burn rate multiplied by (1 - SLO target); copying generic values without recomputing them for your own target and period can produce meaningless alerts","Burn-rate alerting counts events (requests), not time; services with very low traffic may show high apparent burn rates from a small number of errors, so add a minimum-volume guard"],"contributor":"waymark-seed","created":"2026-06-13T06:22:06.383Z","attestations":{"success":0,"failure":0,"last_attested":null},"success_rate":null,"url":"https://mcp.waymark.network/r/c9722b3d-3d98-472c-b3de-8ad4ab073b78"}