Load or define your trained PyTorch model and call model.eval() so that layers such as dropout and batch normalization run in inference mode during export.
Create a dummy input tensor matching the model's expected input shape and dtype, then export with torch.onnx.export(model, dummy_input, 'model.onnx', input_names=['input'], output_names=['output'], dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}). The dynamic_axes argument marks dimension 0 of both tensors as a variable batch size.
Verify the exported ONNX model using onnx.checker.check_model(onnx.load('model.onnx')) to catch structural errors.
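Load the model in ONNX Runtime: session = onnxruntime.InferenceSession('model.onnx', providers=['CUDAExecutionProvider', 'CPUExecutionProvider']). Providers are tried in the order listed, so this prefers CUDA (which requires a GPU-enabled ONNX Runtime build) and falls back to CPU.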
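Run inference by calling session.run(output_names, {input_name: numpy_input}), where input_name is the name given at export ('input' here, or session.get_inputs()[0].name), output_names is a list of output names (or None to return all outputs), and numpy_input is a NumPy array matching the input shape and dtype.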
Compare ONNX Runtime outputs against the original PyTorch model outputs on the same input to validate numerical equivalence within a tolerance.
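The comparison in the last step can be sketched as below. This is a minimal illustration that uses only NumPy: the helper name compare_outputs, the tolerances rtol=1e-3 and atol=1e-5, and the stand-in arrays are all assumptions for the example, not values prescribed by PyTorch or ONNX Runtime. Pick tolerances that suit your model and precision.

```python
import numpy as np


def compare_outputs(reference, candidate, rtol=1e-3, atol=1e-5):
    """Return (ok, max_abs_diff) for two arrays expected to be numerically close.

    `reference` is treated as the ground truth (the PyTorch output) and
    `candidate` as the value under test (the ONNX Runtime output).
    """
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    # Largest element-wise deviation, useful for logging even when the check passes.
    max_abs_diff = float(np.max(np.abs(reference - candidate))) if reference.size else 0.0
    # np.allclose tests |candidate - reference| <= atol + rtol * |reference|.
    ok = bool(np.allclose(candidate, reference, rtol=rtol, atol=atol))
    return ok, max_abs_diff


# Stand-in data (hypothetical): in practice these come from the two runtimes.
torch_out = np.array([[0.12345, -1.5], [2.0, 0.0]], dtype=np.float32)
ort_out = torch_out + np.float32(1e-6)  # simulated float rounding noise

ok, diff = compare_outputs(torch_out, ort_out)
print(ok, diff)
```

In the workflow above, reference would typically be model(x).detach().cpu().numpy() computed under torch.no_grad(), and candidate would be session.run(None, {input_name: x.cpu().numpy()})[0] for the same tensor x.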
Known gotchas
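Operations that depend on Python control flow (if statements over tensor values) may not export correctly: tracing-based export records only the branch taken for the dummy input, so the other branch is silently dropped. Script the model with torch.jit.script first or refactor the dynamic behavior.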
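The opset version used during export must be supported by your version of ONNX Runtime; an opset or operator that the runtime does not support typically fails when the session is created, for example with a NOT_IMPLEMENTED error saying no implementation was found for a node.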
Dynamic axes must be declared explicitly for any dimension that varies at runtime; undeclared dimensions are fixed to the sizes of the dummy input, so inputs with any other size cause shape mismatch errors during inference.
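The dynamic-axes gotcha can be caught before calling session.run with a small pre-flight check. The sketch below is plain Python and assumes the declared shape has the form reported by session.get_inputs()[0].shape, where fixed dimensions are integers and dynamic ones appear as a name (such as 'batch_size') or None. The helper name shape_mismatches and the example shapes are hypothetical.

```python
def shape_mismatches(declared, actual):
    """List the axes where a concrete shape violates a declared shape.

    `declared` mixes ints (fixed sizes) with strings or None (dynamic axes);
    `actual` is the concrete shape of the array about to be fed to the model.
    """
    if len(declared) != len(actual):
        return [f"rank {len(actual)} != expected rank {len(declared)}"]
    problems = []
    for axis, (want, got) in enumerate(zip(declared, actual)):
        # Only integer entries are fixed; symbolic names and None accept any size.
        if isinstance(want, int) and want != got:
            problems.append(f"axis {axis}: got {got}, expected {want}")
    return problems


# Hypothetical declared shapes: batch axis exported as dynamic vs. left fixed at 1.
dynamic_batch = ["batch_size", 3, 224, 224]
fixed_batch = [1, 3, 224, 224]

print(shape_mismatches(dynamic_batch, (8, 3, 224, 224)))  # []
print(shape_mismatches(fixed_batch, (8, 3, 224, 224)))    # ['axis 0: got 8, expected 1']
```

A non-empty result for a batch of 8 against the fixed shape shows what happens when dimension 0 was not listed in dynamic_axes at export time.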