Deploy gateway Collectors as a Kubernetes Deployment with at least 2 replicas, spread across nodes with topology spread constraints (topologyKey: kubernetes.io/hostname) so that a single node failure cannot take down the entire tier
Front the Deployment with a Service; either use an L7 (gRPC-aware) load balancer, or make the Service headless so agent-side gRPC clients can resolve individual pod IPs and balance across them client-side
Set resource requests and limits based on profiling: use the pprof extension to capture CPU and heap profiles under realistic load before sizing; a common starting point is 1 CPU and 2 GiB memory per gateway pod
Configure a Kubernetes HorizontalPodAutoscaler targeting CPU utilisation or a custom metric such as otelcol_exporter_queue_size (which requires a custom metrics adapter) to scale out automatically under load spikes
Set a PodDisruptionBudget (minAvailable: 1) so node drains and other voluntary evictions never take all gateway pods offline simultaneously; rolling upgrades are bounded separately by the Deployment's maxUnavailable setting
Enable persistent queues (file_storage extension) on agents rather than gateways so agents buffer data locally during gateway rolling restarts, preventing data loss
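The Kubernetes side of the practices above can be sketched as a single multi-document manifest. This is a minimal sketch, not a production-ready configuration: the names otel-gateway and otel-gateway-headless, the observability namespace, the image tag, the 70% CPU target and the replica bounds are all assumptions chosen for illustration, and the Collector's own configuration mount is omitted.

```yaml
# Hypothetical names and namespace; adjust to your cluster.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-gateway
  namespace: observability
spec:
  replicas: 2
  selector:
    matchLabels:
      app: otel-gateway
  template:
    metadata:
      labels:
        app: otel-gateway
    spec:
      # Spread replicas across nodes so one node failure cannot remove the tier.
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: otel-gateway
      containers:
        - name: otel-collector
          # Assumed image; pin a specific version rather than latest.
          image: otel/opentelemetry-collector-contrib:latest
          ports:
            - name: otlp-grpc
              containerPort: 4317
          resources:
            # Starting point only; resize after profiling with pprof.
            requests:
              cpu: "1"
              memory: 2Gi
            limits:
              memory: 2Gi
---
# Headless Service: DNS returns pod IPs, enabling client-side gRPC balancing.
apiVersion: v1
kind: Service
metadata:
  name: otel-gateway-headless
  namespace: observability
spec:
  clusterIP: None
  selector:
    app: otel-gateway
  ports:
    - name: otlp-grpc
      port: 4317
      targetPort: otlp-grpc
---
# Keeps at least one gateway pod running during node drains.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: otel-gateway
  namespace: observability
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: otel-gateway
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: otel-gateway
  namespace: observability
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: otel-gateway
  minReplicas: 2
  maxReplicas: 6          # assumed ceiling
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # assumed target
  behavior:
    scaleDown:
      # Wait 5 minutes of sustained low load before removing pods.
      stabilizationWindowSeconds: 300
```

Scaling on otelcol_exporter_queue_size instead of CPU would need a metrics adapter that exposes that metric to the HPA, which this sketch does not include.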
Known gotchas
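Standard L4 load balancers balance per TCP connection, while gRPC multiplexes all of an agent's requests over one long-lived HTTP/2 connection; each agent therefore stays pinned to a single backend pod and newly added pods receive little traffic, defeating horizontal scale. Use an L7 load balancer that understands HTTP/2 stream multiplexing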
Gateway Collectors are stateless by design, so do not rely on sticky sessions for stateless pipelines; when a downstream processor such as tail sampling requires trace affinity, use the loadbalancing exporter to route all spans of a trace to the same Collector
Scaling down too aggressively can drop in-flight batches; set the HPA scale-down stabilisation window (behavior.scaleDown.stabilizationWindowSeconds) to at least 300 seconds (5 minutes) to prevent thrashing
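On the agent side, the load-balancing gotchas above might translate into a Collector configuration along these lines. This is a sketch under stated assumptions: it targets the contrib distribution (which ships the file_storage extension and the loadbalancing exporter), the hostname otel-gateway-headless.observability.svc.cluster.local is a hypothetical headless Service in front of the gateways, the queue directory is a placeholder, and TLS is disabled only to keep the example short.

```yaml
extensions:
  file_storage:
    # Assumed path; it must exist and be writable by the Collector.
    directory: /var/lib/otelcol/queue

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  # Stateless signals: client-side round robin across gateway pod IPs.
  otlp:
    endpoint: dns:///otel-gateway-headless.observability.svc.cluster.local:4317
    balancer_name: round_robin
    tls:
      insecure: true   # illustration only
    sending_queue:
      enabled: true
      storage: file_storage   # persist the queue across gateway restarts

  # Traces: only needed when the gateways run tail sampling and
  # every span of a trace must reach the same Collector.
  loadbalancing:
    routing_key: traceID
    protocol:
      otlp:
        tls:
          insecure: true   # illustration only
    resolver:
      dns:
        hostname: otel-gateway-headless.observability.svc.cluster.local
        port: 4317

service:
  extensions: [file_storage]
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]
    metrics:
      receivers: [otlp]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      exporters: [otlp]
```

If no pipeline needs trace affinity, the loadbalancing exporter can be dropped and traces sent through the plain otlp exporter like the other signals.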