Create two SageMaker Model objects: one for the production model and one for the candidate shadow model, each referencing its own S3 artifact path and container image
Create an EndpointConfig with two variants: the primary variant (production model) listed under ProductionVariants and the shadow variant listed under ShadowProductionVariants, setting a traffic sampling percentage for the shadow
Create a SageMaker Endpoint from the EndpointConfig; the endpoint serves all prediction responses from the primary variant while replicating the configured percentage of requests to the shadow variant
Invoke the endpoint normally via InvokeEndpoint; callers receive only the production model's response, while the shadow model's responses are logged for analysis and never returned to callers
Monitor the shadow test dashboard in the SageMaker console to compare invocation metrics and instance metrics between the primary and shadow variants side by side
Once analysis is complete, either promote the shadow model by updating the endpoint to a new endpoint configuration in which the candidate is the production variant, or complete the test and retain the existing production variant
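The steps above can be sketched with boto3 as follows. This is a minimal sketch, not a definitive implementation: the variant names, instance type, and helper function names are hypothetical, and it assumes that the shadow variant's InitialVariantWeight relative to the production weight sets the sampled fraction of traffic (check the CreateEndpointConfig reference before relying on that). The payload builders are plain functions, so they can be inspected without AWS credentials.

```python
def model_request(name, image_uri, model_data_url, role_arn):
    """Keyword arguments for the SageMaker create_model call."""
    return {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_data_url},
    }


def variant(variant_name, model_name, weight, instance_type="ml.m5.xlarge", count=1):
    """One ProductionVariant entry; the instance type default is only an example."""
    return {
        "VariantName": variant_name,
        "ModelName": model_name,
        "InstanceType": instance_type,
        "InitialInstanceCount": count,
        "InitialVariantWeight": weight,
    }


def endpoint_config_request(config_name, prod_model, shadow_model, sampling_fraction):
    """Config with one production variant and one shadow variant.

    Assumption: shadow weight / production weight = sampled fraction of traffic.
    """
    if not 0 < sampling_fraction <= 1:
        raise ValueError("sampling_fraction must be in (0, 1]")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [variant("production", prod_model, 1.0)],
        "ShadowProductionVariants": [variant("shadow", shadow_model, sampling_fraction)],
    }


def deploy(sm, endpoint_name, model_requests, config_request):
    """Create models, endpoint config, and endpoint.

    sm is a boto3 SageMaker client, e.g. boto3.client("sagemaker").
    """
    for req in model_requests:
        sm.create_model(**req)
    sm.create_endpoint_config(**config_request)
    sm.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_request["EndpointConfigName"],
    )


def invoke(runtime, endpoint_name, payload, content_type="application/json"):
    """Call the endpoint; only the production variant's response comes back.

    runtime is a boto3 client created with boto3.client("sagemaker-runtime").
    """
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name, ContentType=content_type, Body=payload
    )
    return response["Body"].read()


def promote(sm, endpoint_name, new_config_name, candidate_model):
    """Switch the endpoint to a config where the candidate is the only variant."""
    sm.create_endpoint_config(
        EndpointConfigName=new_config_name,
        ProductionVariants=[variant("production", candidate_model, 1.0)],
    )
    sm.update_endpoint(EndpointName=endpoint_name, EndpointConfigName=new_config_name)
```

Passing the client in as an argument keeps the AWS calls separate from the payload construction, which makes the configuration easy to review or unit test before anything is deployed.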
Known gotchas
Shadow tests run for 7 days by default and for at most 30 days; complete or extend the test explicitly before it expires, because the shadow variant is automatically removed once the test expires
The shadow variant receives a sampled copy of production traffic, not all traffic; set the sampling percentage deliberately — too low a percentage may not provide statistically significant comparison data within the test window
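Shadow variant inference costs are billed in addition to production inference costs; real-time endpoint instances are billed per instance-hour, so the shadow fleet is charged for the whole test duration regardless of the sampling percentage, and a shadow fleet sized like production roughly doubles the compute cost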
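A quick back-of-the-envelope check can help when choosing the sampling percentage and test length. The sketch below is illustrative only: the function names are hypothetical, the hourly price is an input you supply rather than a real quote, and the cost estimate assumes per-instance-hour billing for the shadow fleet.

```python
def expected_shadow_requests(requests_per_day, sampling_percent, test_days=7):
    """Approximate number of requests the shadow variant sees during the test."""
    if not 0 < sampling_percent <= 100:
        raise ValueError("sampling_percent must be in (0, 100]")
    if not 1 <= test_days <= 30:
        raise ValueError("test_days must be between 1 and 30")
    return round(requests_per_day * sampling_percent / 100 * test_days)


def shadow_instance_cost(hourly_price, instance_count, test_days=7):
    """Instance-hour cost of the shadow fleet over the test window.

    Assumes per-instance-hour billing; hourly_price is caller-supplied.
    """
    return hourly_price * instance_count * 24 * test_days


# 100,000 requests/day sampled at 10% over the default 7-day window
print(expected_shadow_requests(100_000, 10))        # 70000
# Two shadow instances at a hypothetical 0.25 per hour for 7 days
print(shadow_instance_cost(0.25, 2))                # 84.0
```

If the expected request count looks too small for a meaningful comparison, raising the sampling percentage or extending the test (up to the 30-day maximum) are the two levers available.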