Steps

Navigate to the Inference Endpoints section of the Hugging Face Hub and click Create new Endpoint.
Select the model repository to deploy, the cloud provider and region, and the hardware tier (CPU or GPU instance type).
Choose the endpoint type: Public (no authentication), Protected (Hub token required), or Private (VPC link).
Configure the scaling settings: the minimum and maximum number of replicas, and the idle timeout for scale-to-zero.
Click Create Endpoint and wait for the status to change to Running; note the assigned HTTPS endpoint URL.
Send requests as an HTTP POST to the endpoint URL with the header Authorization: Bearer YOUR_TOKEN (not needed for Public endpoints) and a JSON body matching the task's expected input format.
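
The final step can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the function names build_request and query are hypothetical, and the {"inputs": ...} body is an assumption that fits many text tasks; check the expected input format for the task you deployed.

```python
import json
import urllib.request


def build_request(endpoint_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for an endpoint.

    `endpoint_url` is the HTTPS URL noted after the endpoint reached Running;
    `token` is a Hub access token (YOUR_TOKEN in the steps above).
    """
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query(endpoint_url: str, token: str, payload: dict, timeout: float = 60.0):
    """Send the request and decode the JSON response body."""
    req = build_request(endpoint_url, token, payload)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (placeholders, assumed payload shape for a text task):
# result = query("https://YOUR_ENDPOINT_URL", "YOUR_TOKEN", {"inputs": "Hello"})
```

The generous default timeout is a deliberate choice here: a scale-to-zero endpoint may need a while to answer its first request.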

Known gotchas

Scale-to-zero endpoints have a cold-start latency (often tens of seconds) on the first request; if low latency is required, set the minimum replica count to at least 1.
The default container for a given model task (text-generation, feature-extraction, etc.) is inferred from the model card metadata; an incorrect or missing pipeline_tag can cause the wrong container to be selected.
Protected and Private endpoints require sending the Authorization header; omitting it returns a 401 error even if the model itself is public.
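
One way to live with the cold-start and 401 gotchas on the client side is to retry only while the endpoint appears to be waking up. The sketch below is illustrative: the helper name call_with_cold_start_retry is hypothetical, and treating 502/503 as "still starting" is an assumption to verify against the responses your endpoint actually returns. A 401 is returned immediately, since retrying will not fix a missing or invalid token.

```python
import time
from typing import Callable, Tuple

# Assumption: these statuses mean "replica still starting"; adjust as needed.
RETRYABLE_STATUSES = frozenset({502, 503})


def call_with_cold_start_retry(
    send: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 6,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call `send` until it returns a non-retryable status or attempts run out.

    `send` performs one HTTP request and returns (status_code, body).
    The delay doubles after each retryable response (2s, 4s, 8s, ...).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    status, body = 0, b""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES:
            return status, body  # success, or an error such as 401
        if attempt < max_attempts - 1:
            sleep(delay)
            delay *= 2
    return status, body  # still retryable after the last attempt
```

With the defaults, the helper waits up to 2 + 4 + 8 + 16 + 32 = 62 seconds in total, which is in the range of the cold-start latency described above. Setting the minimum replica count to at least 1 avoids the wait entirely.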



Hugging Face Inference Endpoints: deploy a model endpoint
