Steps

Register the model in Unity Catalog via mlflow.register_model(model_uri, 'catalog.schema.model_name') or the MLflow UI; if your MLflow client does not already target Unity Catalog, call mlflow.set_registry_uri('databricks-uc') first
In the Databricks UI, navigate to Serving, click Create serving endpoint, then select the Unity Catalog registered model and the desired model version
Configure compute: choose a CPU or GPU instance size, and enable the scale-to-zero option if intermittent traffic is expected
Click Create; the endpoint transitions from Pending to Ready, which can take several minutes
Query the endpoint via its REST URL, passing a Databricks personal access token in the Authorization header (Authorization: Bearer <token>): POST https://<workspace-url>/serving-endpoints/<endpoint-name>/invocations with a JSON payload in the dataframe_records or dataframe_split format
Monitor latency and throughput in the Serving tab, and set up alerts via Databricks Lakehouse Monitoring, or CloudWatch if on AWS
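
The query step above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names (build_invocation_request, to_dataframe_split, query_endpoint) and the 120-second default timeout are assumptions made for this example, and workspace_url is assumed to include the https:// scheme. Only the URL pattern, the Bearer token header, and the two payload formats come from the steps.

```python
import json
import urllib.request


def build_invocation_request(workspace_url, endpoint_name, token, records):
    """Build (url, headers, body) for a dataframe_records scoring request.

    `records` is a list of row dicts, e.g. [{"feature_a": 1.0}].
    All names here are illustrative, not part of any Databricks SDK.
    """
    url = f"{workspace_url.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations"
    headers = {
        "Authorization": f"Bearer {token}",  # Databricks personal access token
        "Content-Type": "application/json",
    }
    body = json.dumps({"dataframe_records": records}).encode("utf-8")
    return url, headers, body


def to_dataframe_split(records):
    """Convert a non-empty list of row dicts to the dataframe_split layout."""
    columns = list(records[0].keys())
    data = [[row[col] for col in columns] for row in records]
    return {"dataframe_split": {"columns": columns, "data": data}}


def query_endpoint(workspace_url, endpoint_name, token, records, timeout=120):
    """POST the records to the endpoint and return the decoded JSON response.

    The timeout is an assumed value, chosen to be generous in case the
    endpoint has to start up before it can answer.
    """
    url, headers, body = build_invocation_request(
        workspace_url, endpoint_name, token, records
    )
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it possible to inspect the exact URL and body before any network call is made.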

Known gotchas

Scale-to-zero endpoints have a cold-start latency of 30–90 seconds on the first request after idle; for latency-sensitive applications, keep at least one replica warm by disabling scale-to-zero (or, on provisioned throughput endpoints, by setting min provisioned throughput above zero)
The serving endpoint expects input in one of the MLflow serving input formats (dataframe_records, dataframe_split, or the TF Serving tensor keys instances and inputs); sending raw JSON without the wrapper key causes a 422 error
Permissions are checked at two levels: the identity that creates the endpoint needs EXECUTE on the Unity Catalog model, and the service principal or user making the inference request needs Can Query permission on the endpoint; a missing grant results in a 403 even if the endpoint is healthy
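
The payload and status-code gotchas above can be guarded against on the client side. The sketch below is a hypothetical helper, not part of MLflow or any Databricks SDK: ensure_wrapped and explain_status are names invented for this example, and the hint strings simply restate the gotchas listed here.

```python
# Wrapper keys accepted by MLflow-style scoring endpoints.
SUPPORTED_KEYS = ("dataframe_records", "dataframe_split", "instances", "inputs")


def ensure_wrapped(payload):
    """Return a payload that carries one of the supported wrapper keys.

    A bare row dict or a list of row dicts is wrapped as dataframe_records.
    Caveat: a bare row whose own feature is named like a wrapper key
    (e.g. "inputs") is treated as already wrapped.
    """
    if isinstance(payload, dict):
        if any(key in payload for key in SUPPORTED_KEYS):
            return payload
        return {"dataframe_records": [payload]}
    if isinstance(payload, list) and payload and all(
        isinstance(row, dict) for row in payload
    ):
        return {"dataframe_records": payload}
    raise ValueError("payload must be a row dict or a non-empty list of row dicts")


def explain_status(status_code):
    """Map an HTTP status code to a troubleshooting hint, or None if unknown."""
    hints = {
        403: "Permission denied: check the grants for the identity making the request.",
        422: "Payload rejected: check that the body uses a wrapper key such as dataframe_records.",
    }
    return hints.get(status_code)
```

For example, ensure_wrapped({"feature_a": 1.0}) yields {"dataframe_records": [{"feature_a": 1.0}]}, which can then be serialized and posted to the invocations URL.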

Related routes

Serve models with Seldon Core 2

docs.seldon.ai · 6 steps · unrated

Serve a registered MLflow model locally as a REST API with mlflow models serve

mlflow.org · 5 steps · unrated

TorchServe: create a model archive and serve a PyTorch model

pytorch.org/serve/docs · 6 steps · unrated

