Steps

Upload your trained model artifact to a Cloud Storage (GCS) bucket, then register it with Vertex AI using aiplatform.Model.upload(), passing the artifact location as artifact_uri and the serving container image as serving_container_image_uri.
Create an endpoint with aiplatform.Endpoint.create(), giving it a display name and the target project and location.
Deploy the model to the endpoint using endpoint.deploy(), specifying the model, machine_type, min_replica_count and max_replica_count, and optionally a traffic split (traffic_percentage or traffic_split).
Wait for the deployment to complete. The endpoint.deploy() call blocks until it finishes by default (sync=True), which can take several minutes.
Send a prediction request using endpoint.predict(instances=[...]), where instances is a list of JSON-serializable inputs (typically dicts or lists) in the format your model's serving container expects.
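Undeploy the model (for example with endpoint.undeploy_all()) and delete the endpoint with endpoint.delete() when finished. A deployed model is billed for its provisioned nodes even when it receives no traffic.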
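
The six steps above can be sketched end to end with the Vertex AI Python SDK (google-cloud-aiplatform). This is a minimal sketch, not a production deployment script: the function name run_online_prediction_demo, the display names, and the n1-standard-2 machine type are illustrative assumptions, and the project, bucket path, and container image must come from your own environment.

```python
from typing import Any, List


def run_online_prediction_demo(
    project: str,
    location: str,
    artifact_uri: str,
    serving_image_uri: str,
    instances: List[Any],
) -> List[Any]:
    """Upload, deploy, predict once, then clean up. Returns the predictions."""
    # Imported inside the function so this module loads without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)

    # Step 1: register the artifact already uploaded to GCS.
    model = aiplatform.Model.upload(
        display_name="demo-model",  # assumed name
        artifact_uri=artifact_uri,
        serving_container_image_uri=serving_image_uri,
    )

    # Step 2: create an empty endpoint.
    endpoint = aiplatform.Endpoint.create(display_name="demo-endpoint")  # assumed name

    try:
        # Steps 3 and 4: deploy() blocks until the deployment finishes (sync=True).
        endpoint.deploy(
            model=model,
            machine_type="n1-standard-2",  # assumed machine type
            min_replica_count=1,
            max_replica_count=1,
            traffic_percentage=100,
        )

        # Step 5: send one online prediction request.
        response = endpoint.predict(instances=instances)
        return list(response.predictions)
    finally:
        # Step 6: stop billing for serving nodes, even if a step above failed.
        endpoint.undeploy_all()
        endpoint.delete()
```

The try/finally is a design choice for a throwaway demo: it tears the endpoint down even when deployment or prediction raises. A long-lived service would keep the endpoint and skip the cleanup.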

Known gotchas

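A custom serving container must answer prediction requests on the route configured through serving_container_predict_route (commonly /predict) and health checks on the route set by serving_container_health_route. Vertex AI forwards each request there as a JSON body of the form {"instances": [...]} and expects a JSON response of the form {"predictions": [...]}.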
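Traffic split values across all deployed models on an endpoint must sum to 100; a split that sums to anything else is rejected with a validation error. In the traffic_split dict, keys are deployed model IDs and the key "0" refers to the model being deployed in that call.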
Quotas for online prediction request rate and serving nodes are enforced per project and region. Requests beyond a quota fail with a resource exhaustion error (RESOURCE_EXHAUSTED, HTTP 429); autoscaling does not raise a quota limit.
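
To make the serving-container gotcha above concrete, here is a minimal sketch of a container entry point built only on the Python standard library. It assumes Vertex AI passes the port and routes through the AIP_HTTP_PORT, AIP_PREDICT_ROUTE, and AIP_HEALTH_ROUTE environment variables, with local fallbacks chosen here for illustration. The score function and the "features" field are hypothetical stand-ins for a real model and its input schema.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set by Vertex AI inside the container; the fallbacks are for local runs only.
PREDICT_ROUTE = os.environ.get("AIP_PREDICT_ROUTE", "/predict")
HEALTH_ROUTE = os.environ.get("AIP_HEALTH_ROUTE", "/health")
PORT = int(os.environ.get("AIP_HTTP_PORT", "8080"))


def score(instance):
    """Hypothetical model: sums the numbers in the instance's 'features' list."""
    return sum(instance["features"])


def handle_predict(request_body):
    """Map a {"instances": [...]} request to a {"predictions": [...]} response."""
    return {"predictions": [score(inst) for inst in request_body["instances"]]}


class Handler(BaseHTTPRequestHandler):
    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check: any 200 response marks the container as ready.
        if self.path == HEALTH_ROUTE:
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        if self.path != PREDICT_ROUTE:
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        request_body = json.loads(self.rfile.read(length))
        self._send_json(200, handle_predict(request_body))


def serve(port=PORT):
    """Run the server until interrupted; call this from the container entry point."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling serve() starts the server. Keeping handle_predict separate from the HTTP handler lets the request and response shape be checked locally before building the image.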

Related routes

cloud.google.com · 6 steps · unrated

Set up a Vertex AI batch prediction job for offline scoring of large datasets

cloud.google.com/vertex-ai/docs · 5 steps · unrated

Set up Vertex AI Model Monitoring v2 to detect feature drift on a deployed endpoint

cloud.google.com/vertex-ai/docs · 5 steps · unrated


Vertex AI: create and query an online prediction endpoint
