Scale OpenTelemetry Collector deployments using the loadbalancingexporter to route traces from gateway collectors to tail-sampling backends by trace ID
In a two-tier Collector architecture, configure the first tier (gateway/load-balancer) to use the loadbalancingexporter from opentelemetry-collector-contrib, which is declared under the loadbalancing key in the exporters section
Set routing_key: traceID in the loadbalancing exporter config so that all spans sharing a trace ID are consistently forwarded to the same second-tier Collector instance
Configure the resolver as static (for a fixed list of backend hostnames), dns (resolves a hostname, typically a headless Kubernetes Service, to the IPs of all backend pods), or k8s (watches the endpoints of the Kubernetes Service named in its service field), depending on your deployment model
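Under protocol, configure otlp with the OTLP exporter settings (TLS, timeouts, and so on) to use for every backend; the backend addresses themselves come from the resolver, and an endpoint set under protocol is overridden. The loadbalancing exporter uses consistent hashing, so adding or removing a backend remaps only a fraction of traces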
Configure the second-tier Collectors with the tailsamplingprocessor (the tail_sampling key in the processors section); each instance now receives all spans for the traces it is responsible for, which is required for accurate tail-based sampling decisions
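Monitor the otelcol_loadbalancer_backend_latency and otelcol_loadbalancer_num_resolutions metrics on the gateway tier: per-backend latency helps reveal slow, overloaded, or failing backends, and the resolution count shows whether the resolver is still refreshing the backend list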
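The steps above can be sketched as two Collector configs. This is a minimal illustration, not a production setup: the hostnames otelcol-sampling-headless.observability.svc.cluster.local and tracing-backend.example.com are hypothetical, plaintext gRPC (tls.insecure) is assumed between tiers, and the DNS resolver is picked only as an example. A gateway-tier config might look like this:

```yaml
# Gateway tier (sketch): receive OTLP and fan out by trace ID.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  loadbalancing:
    routing_key: traceID          # keep all spans of a trace on one backend
    protocol:
      otlp:
        # OTLP exporter settings applied to every backend connection.
        # No endpoint here: backend addresses come from the resolver.
        tls:
          insecure: true          # assumption: plaintext inside the cluster
    resolver:
      dns:
        # Hypothetical headless Service fronting the second-tier Collectors.
        hostname: otelcol-sampling-headless.observability.svc.cluster.local
        port: 4317

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]
```

A matching second-tier config, with two illustrative sampling policies (keep traces containing errors, plus 10% of traces chosen probabilistically), could then be:

```yaml
# Second tier (sketch): tail-sample complete traces, then export.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  tail_sampling:
    decision_wait: 10s            # illustrative: how long to buffer a trace before deciding
    policies:
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: sample-rest
        type: probabilistic
        probabilistic:
          sampling_percentage: 10

exporters:
  otlp:
    endpoint: tracing-backend.example.com:4317   # hypothetical tracing backend

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [otlp]
```

The policy names, decision_wait, and sampling percentage are placeholders to adapt to your own traffic; the two-tier split itself is what makes the sampling decisions see whole traces.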
Known gotchas
The loadbalancing exporter is in the contrib repository, not the core collector; ensure you use an otelcol-contrib distribution or build a custom collector image that includes this exporter
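The DNS resolver refreshes its backend list by re-resolving the hostname periodically instead of reacting to changes immediately. When backend pods scale up or down, there is a brief window in which the gateway still uses a stale endpoint list, and the consistent-hash ring changes once the list is refreshed. Spans from a trace that is in flight during this window can land on different backends, so tail-sampling decisions for those traces may be made on incomplete data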
Using service as routing_key routes all spans from the same service to the same backend, which is useful for metrics aggregation but not for tail sampling, which needs all spans of the same trace together
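For the stale-endpoint window described above, the DNS resolver's polling settings are the relevant knobs. The fragment below is a sketch under assumptions: the hostname is hypothetical, and the interval and timeout values are illustrative. The contrib README lists defaults of about 5s and 1s respectively, but verify this against the version you run.

```yaml
exporters:
  loadbalancing:
    routing_key: traceID
    protocol:
      otlp:
        tls:
          insecure: true          # assumption: plaintext inside the cluster
    resolver:
      dns:
        # Hypothetical headless Service fronting the second-tier Collectors.
        hostname: otelcol-sampling-headless.observability.svc.cluster.local
        port: 4317
        interval: 2s              # illustrative: re-resolve more often to shorten the stale window
        timeout: 1s               # illustrative: give up on a slow DNS lookup
```

A shorter interval narrows the window at the cost of more DNS queries; it does not remove the remapping that happens whenever the backend set changes.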