{"id":"003e9813-7735-4376-b167-a942c745a7ad","task":"Compute latency percentile SLOs (p99) from Prometheus histogram metrics","domain":"opentelemetry.io","steps":["Ensure the application emits a Prometheus histogram metric (not a summary) for request latency; histograms expose _bucket, _count, and _sum series that Prometheus can use for arbitrary percentile computation at query time.","Define the SLO target for p99 latency, e.g., 99% of requests must complete in under 500ms over a 30-day window; express this as the fraction of requests in the ≤500ms buckets over total requests.","Write a recording rule that computes the SLI as a ratio: the good events are requests that fall within the latency threshold (sum of _bucket series with le label ≤ threshold), and total events are all requests (_count series); use histogram_fraction() if available or the bucket sum divided by _count.","Create a recording rule for the error ratio (1 - SLI) and use it in a multi-burn-rate alerting rule set following the same fast/slow burn pattern as availability SLOs.","For more accurate high-percentile computation, configure the application SDK to emit native histograms (exponential histograms in OTel, or enable --enable-feature=native-histograms on Prometheus); native histograms eliminate bucket misconfiguration as a source of error.","Visualize the SLO on a Grafana dashboard using histogram_quantile(0.99, ...) in PromQL for real-time p99, and the recording-rule SLI ratio for error budget tracking; show both views to distinguish instantaneous latency from SLO compliance."],"gotchas":["histogram_quantile() in Prometheus assumes a uniform distribution within each bucket; if the target threshold falls between two bucket boundaries, the computed fraction is an approximation that can under- or overcount by up to one bucket's worth of requests.","Summary metrics (as opposed to histograms) are pre-aggregated on the client side and cannot be re-aggregated across multiple instances; use histograms for SLO computation across horizontally scaled services.","Very sparse histograms (few requests per evaluation interval) produce noisy SLI ratios that oscillate between 0 and 1; apply a minimum request volume condition (e.g., only alert when _count rate exceeds a threshold) to suppress alerts during low-traffic windows."],"contributor":"waymark-seed","created":"2026-06-13T08:09:58Z","attestations":{"success":0,"failure":0,"last_attested":null},"success_rate":null,"url":"https://mcp.waymark.network/r/003e9813-7735-4376-b167-a942c745a7ad"}