Steps

Ensure KServe is installed on the cluster (standard mode, or serverless mode with Knative) and the InferenceService CRD (inferenceservices.serving.kserve.io) is registered
Write an InferenceService manifest specifying apiVersion: serving.kserve.io/v1beta1, kind: InferenceService, and a predictor block with the model framework and storage URI, for example: predictor.sklearn.storageUri pointing to a GCS or S3 path
Apply the manifest in the target namespace: kubectl apply -f isvc.yaml -n <namespace>
Wait for the service to reach Ready state: kubectl get inferenceservice <name> -n <namespace> and check the READY column
Retrieve the inference URL from the status.url field: kubectl get inferenceservice <name> -n <namespace> -o jsonpath='{.status.url}'
Send a prediction using the V2 inference protocol: POST to <url>/v2/models/<name>/infer with a JSON body containing an inputs array, where each entry carries name, shape, datatype, and data
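The manifest and the V2 request from the steps above can be sketched in Python. This is a minimal sketch, not the only way to do it: the helper names build_isvc and build_v2_request, the tensor name "input-0", the FP32 datatype, and the example bucket and host are all assumptions made for illustration. The manifest is emitted as JSON, which kubectl apply accepts alongside YAML.

```python
import json


def build_isvc(name: str, namespace: str, storage_uri: str) -> dict:
    """Return an InferenceService manifest using the predictor.sklearn shorthand."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"predictor": {"sklearn": {"storageUri": storage_uri}}},
    }


def build_v2_request(url: str, name: str, rows: list) -> tuple:
    """Return the V2 infer endpoint and JSON body for a 2-D batch of rows."""
    endpoint = f"{url.rstrip('/')}/v2/models/{name}/infer"
    body = {
        "inputs": [
            {
                "name": "input-0",  # hypothetical tensor name; depends on the model
                "shape": [len(rows), len(rows[0]) if rows else 0],
                "datatype": "FP32",  # assumed datatype for this sketch
                "data": [value for row in rows for value in row],  # row-major flatten
            }
        ]
    }
    return endpoint, body


if __name__ == "__main__":
    # Hypothetical service name, namespace, bucket, and host.
    manifest = build_isvc("sklearn-iris", "default", "gs://example-bucket/models/iris")
    print(json.dumps(manifest, indent=2))  # save as isvc.json, then kubectl apply -f isvc.json

    endpoint, body = build_v2_request(
        "http://sklearn-iris.default.example.com", "sklearn-iris", [[6.8, 2.8, 4.8, 1.4]]
    )
    print(endpoint)
    print(json.dumps(body))
```

The request body can then be sent with any HTTP client, using the URL returned in status.url as the first argument.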

Known gotchas

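In serverless mode, a predictor that is allowed to scale to zero (minReplicas: 0) incurs cold-start latency on the first request after an idle period. The KServe API reference documents minReplicas as defaulting to 1, so check the value in use and set minReplicas: 1 on the predictor spec to keep a warm replica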
Credentials for the storage URI (S3, GCS, Azure) must be supplied as a Kubernetes Secret attached to the service account the predictor runs under (serviceAccountName), or referenced through the model's storage spec; missing credentials cause the storage-initializer init container to fail
Newer KServe versions select a ServingRuntime or ClusterServingRuntime through a predictor.model block with modelFormat.name, in place of the older predictor.sklearn / predictor.xgboost shorthand; check the installed KServe version to use the matching spec
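The three gotchas above meet in the predictor spec, so a small sketch may help. It assumes the newer predictor.model form, with minReplicas and serviceAccountName set as fields on the predictor. The helper name build_predictor and the example service account and bucket are hypothetical.

```python
import json
from typing import Optional


def build_predictor(
    model_format: str,
    storage_uri: str,
    min_replicas: int = 1,
    service_account: Optional[str] = None,
) -> dict:
    """Return a spec.predictor block in the predictor.model form."""
    predictor = {
        "minReplicas": min_replicas,  # 1 keeps a warm replica; 0 allows scale-to-zero
        "model": {
            "modelFormat": {"name": model_format},  # e.g. "sklearn" or "xgboost"
            "storageUri": storage_uri,
        },
    }
    if service_account is not None:
        # Service account whose attached Secret holds the storage credentials.
        predictor["serviceAccountName"] = service_account
    return predictor


if __name__ == "__main__":
    # Hypothetical bucket and service account names.
    spec = {"predictor": build_predictor("sklearn", "s3://example-bucket/iris", 1, "s3-reader")}
    print(json.dumps(spec, indent=2))
```

The resulting dict slots into the spec field of an InferenceService manifest. Whether the predictor.model form is accepted depends on the installed KServe version.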

kserve.github.io/website/docs · 6 steps · unrated


