Create a standard or FIFO SQS queue; create a separate dead-letter queue (DLQ) of the same type and configure a redrive policy on the source queue specifying the DLQ ARN and maxReceiveCount (typically 3–5)
Set the queue's VisibilityTimeout to safely exceed your worst-case processing time; if processing can vary, extend visibility mid-flight using ChangeMessageVisibility before the timeout expires
Poll using ReceiveMessage with WaitTimeSeconds up to 20 (long polling) to reduce empty-receive API calls and cost; receive up to 10 messages per call (MaxNumberOfMessages=10)
Process each message and delete it via DeleteMessage using the ReceiptHandle only after successful processing — never delete before confirming success; undeleted messages become visible again after the visibility timeout
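Implement idempotency keyed on the SQS MessageId or on a business-level deduplication key stored in a database or ElastiCache; standard queues deliver at least once and occasionally deliver duplicates. MessageId only catches redeliveries of the same send, while a business-level key also catches a producer that sends the same event twice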
Monitor ApproximateNumberOfMessagesNotVisible (in-flight), ApproximateAgeOfOldestMessage, and the DLQ depth (ApproximateNumberOfMessagesVisible on the DLQ) in CloudWatch; alert when DLQ depth is greater than 0
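The setup and consume steps above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes `sqs` is a boto3 SQS client (for example `boto3.client("sqs")`) and a standard queue. The function names, the `-dlq` naming suffix, the `handler` callback, and the in-memory `seen` set (a stand-in for the database or ElastiCache store) are hypothetical and not part of any AWS API.

```python
import json


def create_queue_with_dlq(sqs, name, max_receive_count=5, visibility_timeout=120):
    """Create a DLQ, then a source queue whose redrive policy points at it."""
    # Hypothetical naming convention: "<name>-dlq" for the dead-letter queue.
    dlq_url = sqs.create_queue(QueueName=f"{name}-dlq")["QueueUrl"]
    dlq_arn = sqs.get_queue_attributes(
        QueueUrl=dlq_url, AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]
    attributes = {
        # Attribute values are strings; RedrivePolicy is a JSON document.
        "VisibilityTimeout": str(visibility_timeout),
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
        ),
    }
    return sqs.create_queue(QueueName=name, Attributes=attributes)["QueueUrl"]


def poll_once(sqs, queue_url, handler, seen):
    """Long-poll once, process each new message, delete only after success."""
    response = sqs.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    processed = 0
    for message in response.get("Messages", []):  # key is absent when empty
        key = message["MessageId"]  # or a business-level deduplication key
        if key not in seen:
            try:
                handler(message["Body"])
            except Exception:
                # Leave the message undeleted: it becomes visible again after
                # the visibility timeout and, after maxReceiveCount receives,
                # moves to the DLQ.
                continue
            seen.add(key)
            processed += 1
        # Reached after success, or for an already-processed duplicate.
        sqs.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
    return processed
```

In a real consumer, `poll_once` would run in a loop and `seen` would be replaced by a durable store so the idempotency check survives restarts.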
Known gotchas
The visibility timeout is per-message, not per-consumer — if your consumer crashes mid-processing, the message reappears after the timeout and will be reprocessed; this is expected behavior, not a bug, so idempotency is non-negotiable
FIFO queues deduplicate on MessageDeduplicationId within a 5-minute window: sending the same ID twice in that window returns success for both calls, but the second message is silently dropped and never delivered; standard queues have no such deduplication
Long polling (WaitTimeSeconds > 0) is almost always better than short polling; short polling samples only a subset of SQS servers and can return empty responses even when messages exist
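To make the FIFO deduplication gotcha concrete, here is a minimal sketch of a producer that derives a stable MessageDeduplicationId from a business key, so a retried send of the same event inside the 5-minute window is dropped by SQS rather than delivered twice. It assumes `sqs` is a boto3 SQS client and that the queue is a FIFO queue; `dedup_id_for`, `send_fifo`, and the `business_key` parameter are hypothetical names chosen for illustration.

```python
import hashlib


def dedup_id_for(business_key):
    """Derive a stable 64-character hex ID from a business-level key."""
    # A SHA-256 hex digest fits within the 128-character limit on
    # MessageDeduplicationId.
    return hashlib.sha256(business_key.encode("utf-8")).hexdigest()


def send_fifo(sqs, queue_url, body, group_id, business_key):
    """Send to a FIFO queue with an explicit deduplication ID."""
    return sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=body,
        MessageGroupId=group_id,  # ordering is preserved within a group
        MessageDeduplicationId=dedup_id_for(business_key),
    )
```

Because the ID depends only on the business key, two different payloads sent under the same key within the window would also be treated as duplicates, so the key should identify the event rather than the entity.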