Define a BentoML Service class with the @bentoml.service decorator and an @bentoml.api method whose input and output types are declared with type annotations, for example pydantic models or numpy arrays
Save a trained model to the BentoML model store with bentoml.sklearn.save_model('my_model', clf) and reference it in the service via bentoml.sklearn.get('my_model:latest')
Build a Bento with bentoml build, which packages the service code, model artifacts, and dependencies into a versioned bundle
Containerize with bentoml containerize <bento_tag> to produce a Docker image, then push it to a container registry
Deploy to Kubernetes using a Deployment manifest referencing the image, or use bentoml deployment create with a BentoCloud or Yatai backend
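The Kubernetes step above can be sketched as a small script that builds a Deployment manifest referencing the pushed image. This is only an illustration under stated assumptions: the names build_deployment and iris-classifier and the registry URL are hypothetical, port 3000 is assumed to be the port the containerized service listens on, and /readyz is assumed to be its readiness endpoint. Adjust all of these to match your image.

```python
import json


def build_deployment(name: str, image: str, replicas: int = 2, port: int = 3000) -> dict:
    """Return an apps/v1 Deployment manifest as a plain dict.

    Assumptions (not from the source): the container serves HTTP on `port`
    and exposes a readiness endpoint at /readyz.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                            "readinessProbe": {
                                "httpGet": {"path": "/readyz", "port": port}
                            },
                        }
                    ]
                },
            },
        },
    }


# Hypothetical image reference; use the tag you pushed to your registry.
manifest = build_deployment(
    "iris-classifier", "registry.example.com/iris_classifier:v1"
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

kubectl accepts JSON manifests as well as YAML, so the printed output can be saved to a file and applied with kubectl apply -f. Building the manifest in code keeps the image tag in one place, which helps when each build produces a new version.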
Known gotchas
BentoML's bentofile.yaml should pin every Python dependency to an exact version; relying on unpinned or implicit transitive dependencies causes non-reproducible builds across environments
The @bentoml.api decorator infers input/output schemas from Python type annotations; using the Any type or omitting annotations silently disables schema validation, so the endpoint may accept malformed requests
Containerization bakes the model artifact into the image by default. For large models (>1 GB), consider using an external model store (S3 or BentoCloud) and lazy loading at startup to keep image sizes manageable
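The dependency-pinning gotcha can be checked mechanically before a build. The sketch below is a hypothetical helper, not part of BentoML: it takes a list of requirement strings, such as the entries you would put under python.packages in bentofile.yaml, and reports any that are not pinned with ==. The helper name find_unpinned and the sample package list are assumptions made for illustration.

```python
import re

# Matches an exact pin such as "scikit-learn==1.4.2", optionally with extras.
_PINNED = re.compile(
    r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,-]+\])?==[A-Za-z0-9.*+!_-]+$"
)


def find_unpinned(requirements: list[str]) -> list[str]:
    """Return the requirement specs that lack an exact '==' version pin."""
    unpinned = []
    for line in requirements:
        spec = line.split("#", 1)[0].strip()  # drop trailing comments
        spec = spec.split(";", 1)[0].strip()  # drop environment markers
        if not spec:
            continue
        if not _PINNED.match(spec.replace(" ", "")):
            unpinned.append(spec)
    return unpinned


# Hypothetical package list for illustration only.
packages = [
    "scikit-learn==1.4.2",
    "numpy>=1.26",
    "pandas",
    "pydantic==2.7.1  # request schemas",
]
print(find_unpinned(packages))
```

Running a check like this in CI before bentoml build is one way to catch a loose version specifier early, though it does not cover transitive dependencies that are never listed at all.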