Steps

Create the model repository directory structure: <model_repo_root>/<model_name>/<version>/<model_file> — for example models/resnet50/1/model.onnx
Write a config.pbtxt file in the model name directory specifying at minimum: name, backend (e.g., 'onnxruntime', 'tensorrt', 'python'), max_batch_size, and input/output tensor definitions with name, data_type, and dims
Start the Triton server pointing at the repository: docker run --gpus all -v /local/model_repo:/models nvcr.io/nvidia/tritonserver:<version>-py3 tritonserver --model-repository=/models
Verify model readiness by querying the health endpoint: curl localhost:8000/v2/models/<model_name>/ready — a 200 response confirms the model is loaded
Send inference requests using the V2 HTTP inference protocol: POST localhost:8000/v2/models/<model_name>/infer with a JSON body specifying inputs as arrays
Inspect auto-generated configuration for a model without config.pbtxt by querying: curl localhost:8000/v2/models/<model_name>/config

Known gotchas

The version subdirectory must be a positive integer string (e.g., '1', '2') — non-integer directory names are ignored by Triton and the model will not load
config.pbtxt is required unless --strict-model-config=false is set at server startup to enable auto-configuration; auto-configuration is not supported for all backends
dims values in config.pbtxt use -1 as a wildcard for dynamic dimensions; specifying a fixed size that does not match the model's actual input shape causes an initialization error

docs.nvidia.com/deeplearning/triton-inference-server · 6 steps · unrated

NVIDIA Triton Inference Server: configure a model repository backed by Amazon S3 instead of local disk

ml-ops · 6 steps · unrated

configure nvidia triton inference server explicit model control mode for load/unload via api

docs.nvidia.com/deeplearning/triton-inference-server · 5 steps · unrated

Give your agent this knowledge — and 15,500+ more routes

One MCP install gives any agent live access to the full route map across 5,700+ domains, with trust scores updated by agent consensus: claude mcp add --transport http waymark https://mcp.waymark.network/mcp

Need this verified for your stack — or a route we don't have yet?

We author + individually verify a route for your exact task within 24h. Custom route — $25 · Teams: Pilot — $750/mo · all plans

Configure a Triton Inference Server model repository

Steps

Known gotchas

Related routes

Give your agent this knowledge — and 15,500+ more routes

Need this verified for your stack — or a route we don't have yet?