Steps

Export a PyTorch model to ONNX using torch.onnx.export(model, example_input, 'model.onnx', input_names=['input'], output_names=['output'], dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}})
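Verify the exported model with onnx.checker.check_model(onnx.load('model.onnx')) to confirm the graph is well-formed (valid operators, attributes, and opset imports) before optimization; pass full_check=True to also run shape inference and catch shape inconsistencies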
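Apply graph optimizations offline: create a SessionOptions object, set sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL, set sess_options.optimized_model_filepath = 'model_opt.onnx', then create a session on the original model.onnx, which runs the optimizer and writes the optimized graph to disk. Generate the file with the same execution providers and the same kind of hardware as the deployment target, because ORT_ENABLE_ALL includes hardware-specific layout optimizations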
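Load the optimized model for inference with a fresh SessionOptions object, not the one used for optimization (reusing it would re-run the optimizer and overwrite model_opt.onnx): infer_options = ort.SessionOptions(), set infer_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL since the graph is already optimized, then session = ort.InferenceSession('model_opt.onnx', infer_options, providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])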
Run inference: outputs = session.run(None, {'input': input_array}) where input_array is a numpy array matching the declared input shape and dtype
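Profile performance by setting enable_profiling = True on the SessionOptions object passed to InferenceSession; after running inference, call session.end_profiling(), which returns the path of the JSON trace file used to identify bottlenecks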
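The steps above can be sketched as three small functions. This is a minimal sketch, not a definitive implementation: the function names, constants, and default file names are illustrative assumptions, and the exact behavior of torch.onnx.export (including how it treats dynamic_axes) varies between PyTorch versions. Imports are deferred into each function so that the inference side needs only onnxruntime installed, not torch or onnx.

```python
# Illustrative names; only the library calls inside the functions come from
# torch, onnx, and onnxruntime.
INPUT_NAME = "input"
OUTPUT_NAME = "output"
# Mark dimension 0 of both tensors as a variable-size batch dimension.
DYNAMIC_AXES = {INPUT_NAME: {0: "batch_size"}, OUTPUT_NAME: {0: "batch_size"}}
PROVIDERS = ["CUDAExecutionProvider", "CPUExecutionProvider"]


def export_and_check(model, example_input, onnx_path="model.onnx"):
    """Export a PyTorch model to ONNX and validate the resulting file."""
    import onnx
    import torch

    model.eval()  # use inference behavior for dropout / batch norm
    torch.onnx.export(
        model,
        example_input,
        onnx_path,
        input_names=[INPUT_NAME],
        output_names=[OUTPUT_NAME],
        dynamic_axes=DYNAMIC_AXES,
    )
    onnx.checker.check_model(onnx.load(onnx_path))


def optimize_offline(onnx_path="model.onnx", optimized_path="model_opt.onnx"):
    """Run ONNX Runtime graph optimizations once and save the result."""
    import onnxruntime as ort

    opts = ort.SessionOptions()
    opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
    opts.optimized_model_filepath = optimized_path
    # Creating the session runs the optimizer and writes optimized_path.
    ort.InferenceSession(onnx_path, opts, providers=PROVIDERS)


def run_optimized(input_array, optimized_path="model_opt.onnx", profile=False):
    """Load the pre-optimized model, run it, and optionally profile the run."""
    import onnxruntime as ort

    opts = ort.SessionOptions()  # fresh options: do not re-optimize or re-save
    opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
    opts.enable_profiling = profile
    session = ort.InferenceSession(optimized_path, opts, providers=PROVIDERS)
    outputs = session.run(None, {INPUT_NAME: input_array})
    # end_profiling() returns the path of the JSON trace file.
    trace_file = session.end_profiling() if profile else None
    return outputs, trace_file
```

A hypothetical end-to-end run would call export_and_check(torch.nn.Linear(4, 2), torch.randn(1, 4)), then optimize_offline(), then run_optimized(np.random.rand(8, 4).astype(np.float32), profile=True). The batch of 8 is accepted only because dimension 0 was declared dynamic at export.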

Known gotchas

The opset_version passed to torch.onnx.export must be supported by the installed ONNX Runtime version. A model exported with a newer opset than the runtime supports fails to load; check the onnxruntime release notes for the supported opsets
Dynamic axes must be declared at export time; a model exported with fixed batch size will raise a shape mismatch error when given a batch of different size at inference
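CUDAExecutionProvider must be listed before CPUExecutionProvider in the providers list to use the GPU, because providers are tried in priority order. If CUDA is unavailable, ONNX Runtime falls back to CPU without raising an error (at most it emits a warning), so confirm which providers are active with session.get_providers()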
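Two of these gotchas can be checked up front with plain Python before a session is created or run. The helpers below are a hedged sketch: order_providers and batch_mismatch are hypothetical names, not part of ONNX Runtime, and they only mirror the rules described above (priority-ordered providers, fixed versus symbolic dimensions).

```python
def order_providers(
    available,
    preferred=("CUDAExecutionProvider", "CPUExecutionProvider"),
):
    """Return the preferred providers that are available, in priority order."""
    return [name for name in preferred if name in available]


def batch_mismatch(declared_shape, actual_shape):
    """Return True if actual_shape violates a declared ONNX input shape.

    Integer dimensions are fixed and must match exactly. Symbolic dimensions
    (strings such as 'batch_size') or None accept any size.
    """
    if len(declared_shape) != len(actual_shape):
        return True
    return any(
        isinstance(declared, int) and declared != actual
        for declared, actual in zip(declared_shape, actual_shape)
    )
```

With onnxruntime installed, order_providers(ort.get_available_providers()) can be passed as the providers argument, and session.get_inputs()[0].shape supplies the declared shape to compare against input_array.shape. An empty result from order_providers, or a GPU-less list where a GPU was expected, makes the CPU fallback visible instead of silent.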

onnxruntime.ai/docs · 6 steps · unrated

Export a PyTorch model to ONNX and validate output parity with onnxruntime

docs.pytorch.org · 5 steps · unrated

ONNX Runtime: deploy a converted ONNX model behind a REST API (e.g. FastAPI) using an ONNX Runtime inference session

ml-ops · 6 steps · unrated


Export models to ONNX and optimize with ONNX Runtime
