Steps

1. Deploy the production model version to a Vertex AI Endpoint using the gcloud ai endpoints deploy-model command or the SDK, with an initial traffic split of 100% to the production deployment.
2. Deploy the candidate model version to the same endpoint, specifying a traffic split that allocates the desired canary percentage to the new deployment and the remainder to the production deployment.
3. Confirm that the traffic split percentages across all deployed models on the endpoint sum to exactly 100; Vertex AI rejects splits that do not total 100.
4. Send prediction requests to the endpoint URL; Vertex AI routes each request to one of the deployed models according to the traffic split percentages.
5. Monitor prediction latency, error rates, and business metrics per deployed model ID in Cloud Monitoring to compare canary and production performance.
6. Promote the canary by updating the endpoint traffic split to 100% for the new deployment, then undeploy the old deployment to release its resources.
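
The steps above can be sketched with the Vertex AI Python SDK. This is a minimal sketch under stated assumptions, not a verified implementation: it assumes `endpoint` is a `google.cloud.aiplatform.Endpoint`, `candidate_model` is a `google.cloud.aiplatform.Model`, and that the installed SDK version supports `Endpoint.update(traffic_split=...)`. The helper names (`canary_split`, `deploy_canary`, `promote_canary`), the display name "canary", and the machine type are illustrative choices, not part of the original route.

```python
from typing import Any, Dict


def canary_split(production_id: str, canary_percent: int) -> Dict[str, int]:
    """Build the traffic split to send along with a new deployment.

    In a deploy request the key "0" stands for the model being deployed;
    existing deployments are keyed by their deployed model ID.
    """
    if not 0 < canary_percent < 100:
        raise ValueError("canary_percent must be between 1 and 99")
    return {"0": canary_percent, production_id: 100 - canary_percent}


def deploy_canary(
    endpoint: Any,
    candidate_model: Any,
    production_id: str,
    canary_percent: int,
    machine_type: str = "n1-standard-4",  # assumption: pick what your model needs
) -> Dict[str, int]:
    """Deploy the candidate next to production with a canary traffic share."""
    split = canary_split(production_id, canary_percent)
    endpoint.deploy(
        model=candidate_model,
        deployed_model_display_name="canary",  # hypothetical display name
        traffic_split=split,
        machine_type=machine_type,
        min_replica_count=1,
        max_replica_count=1,
    )
    return split


def promote_canary(endpoint: Any, canary_id: str, production_id: str) -> None:
    """Route all traffic to the canary, then undeploy the old deployment."""
    endpoint.update(traffic_split={canary_id: 100, production_id: 0})
    # The old deployment keeps consuming compute at 0% until it is undeployed.
    endpoint.undeploy(deployed_model_id=production_id)
```

A call such as `deploy_canary(endpoint, candidate_model, production_id, 10)` would send roughly 10% of requests to the candidate. The production deployment ID can be read from `endpoint.traffic_split` before the canary is deployed, and the canary's ID from the same property afterwards.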

Known gotchas

Traffic split percentages must sum to exactly 100 across all deployments on the endpoint. A split that assigns a percentage to the new deployment without lowering the existing deployments' percentages to compensate fails validation.
Changing the traffic split requires an endpoint update operation that is not instantaneous; during the update window some requests may still be routed according to the old split.
Each deployed model on an endpoint consumes dedicated compute resources even when it receives 0% of traffic after a split update. Explicitly undeploy unused model versions to stop incurring costs.
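
Two of these gotchas can be caught with a small client-side check before calling the API. The sketch below is illustrative only: `validate_traffic_split` and `idle_deployment_ids` are hypothetical helper names, and the inputs are plain dictionaries and lists rather than SDK objects. With the Python SDK, the current split is available as `endpoint.traffic_split` and the deployed model IDs can be taken from `endpoint.list_models()`.

```python
from typing import Dict, Iterable, List


def validate_traffic_split(split: Dict[str, int]) -> None:
    """Raise ValueError unless the percentages are non-negative and total 100."""
    negative = [dep_id for dep_id, pct in split.items() if pct < 0]
    if negative:
        raise ValueError(f"negative traffic percentage for: {negative}")
    total = sum(split.values())
    if total != 100:
        raise ValueError(f"traffic split sums to {total}, expected 100")


def idle_deployment_ids(
    deployed_ids: Iterable[str], split: Dict[str, int]
) -> List[str]:
    """Return deployments that receive no traffic and are candidates to undeploy.

    A deployment missing from the split is treated as receiving 0%.
    """
    return [dep_id for dep_id in deployed_ids if split.get(dep_id, 0) == 0]
```

Running `idle_deployment_ids` after a promotion gives the list of deployments that still hold compute but serve no requests, which is the set to review for undeployment.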

Related routes

Implement a canary rollout with Istio VirtualService traffic splitting using Argo Rollouts

argo-rollouts.readthedocs.io · 6 steps · unrated

Split traffic between two model-serving deployments with Istio for A/B testing

istio.io · 5 steps · unrated

Split traffic between two Vertex AI Endpoint model deployments to perform a canary rollout
