List all logical replication slots and their WAL retention: SELECT slot_name, active, confirmed_flush_lsn, pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes, wal_status FROM pg_replication_slots WHERE slot_type = 'logical'
Alert if lag_bytes exceeds a threshold (e.g. 5 GB): an inactive or stalled slot retains all WAL from its restart_lsn (which is at or before confirmed_flush_lsn), and that can fill the pg_wal directory
Set max_slot_wal_keep_size in postgresql.conf (PostgreSQL 13+, e.g. max_slot_wal_keep_size = 10GB) to cap the WAL that replication slots may retain; once a slot exceeds the cap, PostgreSQL invalidates it at the next checkpoint rather than letting pg_wal fill the disk
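Drop unused or stalled slots: SELECT pg_drop_replication_slot('stale_slot_name'); the call fails while the slot is active, so first confirm that the consuming application (e.g. Debezium) is stopped and no longer needs the slot, because changes made after the drop are not retained for it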
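Check the wal_status column in pg_replication_slots: 'reserved' means the retained WAL is within max_wal_size, 'extended' means retention exceeds max_wal_size but the WAL is still kept, 'unreserved' means required WAL may be removed at the next checkpoint, and 'lost' means the slot has been invalidated and must be recreated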
In PostgreSQL 18+, set idle_replication_slot_timeout to automatically invalidate slots that have been inactive beyond a duration, reducing manual monitoring burden
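The monitoring steps above could be wired into an alert check along the following lines. This is a minimal sketch, not a definitive implementation: it assumes the rows of the slot query have already been fetched as dictionaries keyed by column name (for example through a database driver), and the names SlotAlert and evaluate_slots, as well as the 5 GB default threshold, are illustrative rather than part of PostgreSQL.

```python
from dataclasses import dataclass
from typing import Iterable, List, Mapping

GIB = 1024 ** 3


@dataclass(frozen=True)
class SlotAlert:
    slot_name: str
    severity: str  # "critical" or "warning"
    reason: str


def evaluate_slots(rows: Iterable[Mapping[str, object]],
                   lag_threshold_bytes: int = 5 * GIB) -> List[SlotAlert]:
    """Turn rows of the pg_replication_slots query into alerts.

    Each row is assumed to carry the keys slot_name, active,
    lag_bytes and wal_status, matching the query's column names.
    """
    alerts: List[SlotAlert] = []
    for row in rows:
        name = str(row["slot_name"])
        status = row.get("wal_status")
        lag = row.get("lag_bytes")  # None when confirmed_flush_lsn is NULL

        # An invalidated slot is the most urgent case: it cannot be resumed.
        if status == "lost":
            alerts.append(SlotAlert(name, "critical",
                                    "slot invalidated; recreate and resnapshot"))
            continue

        # Otherwise warn when the consumer has fallen too far behind.
        if lag is not None and int(lag) > lag_threshold_bytes:
            state = "active" if row.get("active") else "inactive"
            alerts.append(SlotAlert(
                name, "warning",
                f"{state} slot lag {int(lag)} bytes exceeds threshold"))
    return alerts
```

Keeping the evaluation separate from the database call means the threshold logic can be unit-tested without a running server; the caller decides how to run the query and where to send the alerts.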
Known gotchas
An invalidated slot (wal_status = 'lost') cannot be recovered — the consumer (Debezium, pglogical, etc.) must perform a full snapshot to resync; avoid invalidation by setting appropriate max_slot_wal_keep_size and alerting early
max_slot_wal_keep_size defaults to -1 (unlimited) on most PostgreSQL installations — without setting this parameter a stalled consumer can silently fill the disk and crash the server
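Before PostgreSQL 17, logical slots on a primary server are not transferred to a standby after failover, so failover tools (e.g. Patroni) must handle slot re-creation on the new primary; PostgreSQL 17+ can synchronize slots created with the failover option to a standby when sync_replication_slots is enabled, but this is off by default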
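Two of the gotchas above lend themselves to a mechanical sanity check, for instance after a deployment or a failover. The sketch below is illustrative only: it assumes the raw setting value of max_slot_wal_keep_size has been read from pg_settings (where "-1" means unlimited), that the list of expected slot names is maintained somewhere outside the database, and that the slot names present on the new primary have been read from pg_replication_slots. Both function names are hypothetical.

```python
from typing import Iterable, List


def wal_retention_is_capped(setting: str) -> bool:
    """Return True if max_slot_wal_keep_size is set to a limit.

    `setting` is assumed to be the raw pg_settings.setting value,
    in which "-1" means unlimited retention.
    """
    return setting.strip() != "-1"


def missing_slots(expected: Iterable[str], present: Iterable[str]) -> List[str]:
    """Return expected slot names that are absent on the server, sorted."""
    return sorted(set(expected) - set(present))
```

A non-empty result from missing_slots after a promotion indicates slots that the failover tooling still has to recreate before consumers reconnect.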