Steps

Upload all model artifacts to a shared S3 prefix (e.g. s3://bucket/models/) — each model is a separate .tar.gz file under that prefix
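Create a SageMaker Model with a multi-model-capable container image (e.g. a supported SageMaker built-in algorithm container or a BYO container that implements the multi-model server spec), setting Mode='MultiModel' and ModelDataUrl to the shared S3 prefix in the container definition
Create the endpoint configuration and the endpoint as usual, referencing that model in ProductionVariants; Mode='MultiModel' belongs to the model's container definition, not to ProductionVariants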
Invoke a specific model: runtime_client.invoke_endpoint(EndpointName=endpoint_name, TargetModel='model_a.tar.gz', Body=payload, ContentType='text/csv')
SageMaker dynamically loads the requested model into container memory on first invocation and caches it; subsequent calls to the same model skip loading
Handle ModelNotReadyException by retrying: it is raised when a large model has not finished loading within the 60-second request timeout, so set the SDK socket (read) timeout to 70 seconds and configure SDK retries
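
The steps above can be sketched as boto3 request shapes. This is a minimal sketch under stated assumptions, not a definitive implementation: the helper names, the variant name, the default instance type and every resource name in the comments are placeholders chosen for illustration. The helpers return plain dictionaries so the requests can be inspected before they are passed to create_model, create_endpoint_config and invoke_endpoint.

```python
from typing import Any, Dict


def model_request(name: str, image_uri: str, role_arn: str, s3_prefix: str) -> Dict[str, Any]:
    """Keyword arguments for the SageMaker create_model call (multi-model container)."""
    if not s3_prefix.endswith("/"):
        s3_prefix += "/"  # ModelDataUrl is a prefix holding many .tar.gz files
    return {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {
            "Image": image_uri,
            "ModelDataUrl": s3_prefix,
            "Mode": "MultiModel",  # set on the container definition
        },
    }


def endpoint_config_request(config_name: str, model_name: str,
                            instance_type: str = "ml.m5.xlarge",  # placeholder size
                            instance_count: int = 1) -> Dict[str, Any]:
    """Keyword arguments for create_endpoint_config; no multi-model flag is needed here."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",  # assumed variant name
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    }


def invoke_request(endpoint_name: str, target_model: str, payload: bytes,
                   content_type: str = "text/csv") -> Dict[str, Any]:
    """Keyword arguments for invoke_endpoint, routed to one model via TargetModel."""
    return {
        "EndpointName": endpoint_name,
        "TargetModel": target_model,  # path relative to ModelDataUrl
        "Body": payload,
        "ContentType": content_type,
    }


# Example wiring (needs boto3 and AWS credentials; all names are placeholders):
#   import boto3
#   sm = boto3.client("sagemaker")
#   sm.create_model(**model_request("mme-demo", image_uri, role_arn, "s3://bucket/models/"))
#   sm.create_endpoint_config(**endpoint_config_request("mme-demo-config", "mme-demo"))
#   sm.create_endpoint(EndpointName="mme-demo", EndpointConfigName="mme-demo-config")
#   runtime = boto3.client("sagemaker-runtime")
#   runtime.invoke_endpoint(**invoke_request("mme-demo", "model_a.tar.gz", b"1.0,2.0\n"))
```

Keeping the request construction separate from the AWS calls makes it easy to check where each setting lives before anything is deployed.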

Known gotchas

ModelNotReadyException is expected for large models on first invocation; configure the boto3 retry strategy to keep retrying for up to 360 seconds rather than failing immediately
Models are evicted from container memory under memory pressure in least-recently-used (LRU) order; endpoints hosting many models may see frequent cold loads, so size instances to fit the hot model set
TargetModel is concatenated with the ModelDataUrl S3 prefix; ensure the value passed as TargetModel exactly matches the S3 object key suffix, including the file extension
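
The retry and TargetModel gotchas above can be illustrated with two small helpers. This is a hedged sketch: target_model_for and invoke_with_retry are hypothetical names, the backoff values are arbitrary, and the loop assumes the exception carries a botocore-style response dictionary whose error code is "ModelNotReadyException". The 360-second budget is the figure given in the text.

```python
import time
from typing import Any, Callable, Dict


def target_model_for(object_uri: str, model_data_url: str) -> str:
    """Derive the TargetModel value for an S3 object stored under ModelDataUrl."""
    prefix = model_data_url if model_data_url.endswith("/") else model_data_url + "/"
    if not object_uri.startswith(prefix):
        raise ValueError(f"{object_uri} is not under {prefix}")
    return object_uri[len(prefix):]  # keeps sub-folders and the .tar.gz extension


def invoke_with_retry(invoke: Callable[..., Any], request: Dict[str, Any],
                      budget_s: float = 360.0, first_delay_s: float = 1.0,
                      max_delay_s: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep,
                      clock: Callable[[], float] = time.monotonic) -> Any:
    """Call invoke(**request), retrying ModelNotReadyException until the budget is spent."""
    deadline = clock() + budget_s
    delay = first_delay_s
    while True:
        try:
            return invoke(**request)
        except Exception as exc:
            response = getattr(exc, "response", None)
            code = response.get("Error", {}).get("Code") if isinstance(response, dict) else None
            # Re-raise anything that is not a cold-load error, or when time has run out.
            if code != "ModelNotReadyException" or clock() + delay > deadline:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay_s)  # exponential backoff with a cap


# Example wiring (needs boto3 and AWS credentials; names are placeholders):
#   import boto3
#   from botocore.config import Config
#   runtime = boto3.client("sagemaker-runtime", config=Config(read_timeout=70))
#   request = {"EndpointName": "mme-demo", "Body": b"1.0,2.0\n", "ContentType": "text/csv",
#              "TargetModel": target_model_for("s3://bucket/models/model_a.tar.gz",
#                                              "s3://bucket/models/")}
#   response = invoke_with_retry(runtime.invoke_endpoint, request)
```

Deriving TargetModel from the full object URI avoids hand-typed filenames that drift from the actual S3 key, and an explicit deadline keeps a slow first load from being retried indefinitely.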

Related routes

amazonaws.com · 6 steps · unrated

Pack multiple models onto a shared GPU endpoint using SageMaker Inference Components

docs.aws.amazon.com/sagemaker · 5 steps · unrated

Implement A/B shadow deployment for a candidate ML model using Amazon SageMaker shadow variants

docs.aws.amazon.com/sagemaker · 6 steps · unrated


Deploy multiple models on a SageMaker Multi-Model Endpoint and route by TargetModel
