Set the queue's default VisibilityTimeout to slightly longer than your maximum expected processing time; call ChangeMessageVisibility mid-processing to extend it if needed
Create a separate SQS queue to serve as the dead-letter queue (DLQ), then attach a redrive policy to the source queue: set deadLetterTargetArn and maxReceiveCount
When a message's receive count exceeds maxReceiveCount (that is, it has been received that many times without being deleted), SQS automatically moves it to the DLQ
Set the DLQ's message retention period long enough for investigation (up to 14 days) via MessageRetentionPeriod
Set the RedriveAllowPolicy attribute on the DLQ to control which source queues are permitted to use it as a DLQ (redrivePermission is allowAll, denyAll, or byQueue with a sourceQueueArns list)
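The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: it assumes `sqs` is a boto3 SQS client (for example `boto3.client("sqs")`), that both queues already exist, and that the function names, queue URLs, ARNs, and default values are hypothetical placeholders chosen for the example.

```python
from __future__ import annotations

import json


def build_redrive_policy(dlq_arn: str, max_receive_count: int) -> str:
    """Return the JSON string used as the source queue's RedrivePolicy attribute."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    return json.dumps(
        {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
    )


def build_redrive_allow_policy(source_queue_arns: list[str]) -> str:
    """Return the JSON string used as the DLQ's RedriveAllowPolicy attribute."""
    return json.dumps(
        {"redrivePermission": "byQueue", "sourceQueueArns": source_queue_arns}
    )


def attach_dlq(
    sqs,
    source_url: str,
    source_arn: str,
    dlq_url: str,
    dlq_arn: str,
    max_receive_count: int = 5,      # assumed value; tune to your retry budget
    visibility_timeout_s: int = 120,  # assumed value; slightly above max processing time
) -> None:
    """Configure the DLQ first, then point the source queue at it."""
    # DLQ: keep messages for the 14-day maximum and allow only this source queue.
    sqs.set_queue_attributes(
        QueueUrl=dlq_url,
        Attributes={
            "MessageRetentionPeriod": str(14 * 24 * 60 * 60),
            "RedriveAllowPolicy": build_redrive_allow_policy([source_arn]),
        },
    )
    # Source queue: set the default visibility timeout and the redrive policy.
    sqs.set_queue_attributes(
        QueueUrl=source_url,
        Attributes={
            "VisibilityTimeout": str(visibility_timeout_s),
            "RedrivePolicy": build_redrive_policy(dlq_arn, max_receive_count),
        },
    )
```

SQS queue attribute values are passed as strings, which is why the numeric settings are wrapped in `str()` and the two policies are serialized with `json.dumps`.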
Known gotchas
VisibilityTimeout maximum is 12 hours; setting it too long means failed messages are invisible to other consumers for a long time before retry
maxReceiveCount counts the number of times ReceiveMessage returns the message, not the number of processing attempts; a consumer that receives but crashes before deleting still increments the count
The DLQ must be in the same AWS account and Region as the source queue; cross-account and cross-Region DLQs are not supported
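To make the first two gotchas concrete, here is a hedged sketch of a single-message consumer. It assumes `sqs` is a boto3 SQS client; `process_one`, `clamp_visibility`, `handler`, and the default `extend_to_s` value are hypothetical names and numbers introduced only for illustration.

```python
MAX_VISIBILITY_S = 12 * 60 * 60  # SQS upper bound on a message's visibility timeout


def clamp_visibility(requested_s: int) -> int:
    """Keep a requested visibility timeout within 0..12 hours."""
    return max(0, min(requested_s, MAX_VISIBILITY_S))


def process_one(sqs, queue_url: str, handler, extend_to_s: int = 300):
    """Receive one message, extend its visibility, and delete it only on success.

    Returns None if the queue was empty, True if the message was processed and
    deleted, and False if the handler raised.
    """
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=10,
        AttributeNames=["ApproximateReceiveCount"],
    )
    messages = resp.get("Messages", [])
    if not messages:
        return None

    msg = messages[0]
    # This receive has already been counted toward maxReceiveCount,
    # whether or not the handler below ever runs to completion.
    receive_count = int(msg["Attributes"]["ApproximateReceiveCount"])

    # Extend visibility for long-running work. The clamp is only a coarse guard:
    # the 12-hour limit is measured from when the message was received.
    sqs.change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=msg["ReceiptHandle"],
        VisibilityTimeout=clamp_visibility(extend_to_s),
    )

    try:
        handler(msg["Body"], receive_count)
    except Exception:
        # Do not delete: the message reappears once the timeout expires.
        return False

    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
    return True
```

Passing `receive_count` to the handler lets it log or alert when a message is close to its maxReceiveCount, before SQS moves it to the DLQ.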