Simplify Model Deployment

BentoML is an open platform that simplifies ML model deployment and enables you to serve your models at production scale in minutes.

Find the right tool for your model serving needs

Create your prediction service

High performance model serving

100x the throughput of a regular Flask-based model server, thanks to our advanced micro-batching mechanism. Read about the benchmarks here.
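To make the idea of micro-batching concrete, here is a minimal, library-independent sketch in Python. It is not BentoML's implementation: `MicroBatcher`, `double_all`, `max_batch_size` and `max_wait_s` are hypothetical names chosen for illustration. The sketch collects requests that arrive within a short window and passes them to the model as a single batch, which is where the throughput gain typically comes from when the model handles batches efficiently.

```python
import asyncio
from contextlib import suppress


class MicroBatcher:
    """Groups concurrent single-item requests into one batched call (illustrative only)."""

    def __init__(self, batch_fn, max_batch_size=4, max_wait_s=0.05):
        self.batch_fn = batch_fn              # function taking a list, returning a list
        self.max_batch_size = max_batch_size  # upper bound on items per batch
        self.max_wait_s = max_wait_s          # how long to wait for a batch to fill
        self.queue = asyncio.Queue()
        self.batch_sizes = []                 # recorded for inspection

    async def predict(self, item):
        # Each caller enqueues its input and waits for its own result.
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((item, future))
        return await future

    async def run(self):
        loop = asyncio.get_running_loop()
        while True:
            # Block until at least one request arrives, then open a batching window.
            batch = [await self.queue.get()]
            deadline = loop.time() + self.max_wait_s
            while len(batch) < self.max_batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), remaining))
                except asyncio.TimeoutError:
                    break
            # One batched call serves every request collected in the window.
            outputs = self.batch_fn([item for item, _ in batch])
            self.batch_sizes.append(len(batch))
            for (_, future), output in zip(batch, outputs):
                future.set_result(output)


def double_all(inputs):
    # Stand-in for a model's batch inference function.
    return [2 * x for x in inputs]


async def demo():
    batcher = MicroBatcher(double_all)
    worker = asyncio.create_task(batcher.run())
    # Five concurrent "requests" share a small number of batched calls.
    results = await asyncio.gather(*(batcher.predict(i) for i in range(5)))
    worker.cancel()
    with suppress(asyncio.CancelledError):
        await worker
    return results, batcher.batch_sizes


results, batch_sizes = asyncio.run(demo())
print(results, batch_sizes)
```

The two tuning knobs trade latency for throughput: a longer `max_wait_s` or larger `max_batch_size` fills batches more fully but makes the first request in each batch wait longer.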

MLOps best practices baked in

Organizations using and contributing to BentoML


Built to work with DevOps & Infrastructure tools



Yatai for teams with mature ML workflows

Collaboration is better

Best Practices in Kubernetes

Copyright 2022 BentoML

Sign up for managed Yatai

Central dashboard for pushing, pulling, and rolling back models for streamlined deployment. No more complicated handoffs between data science and engineering teams.

Managed deployment that implements best practices across the prediction service registry, deployment automation, and endpoint monitoring, all configured automatically for your team. A solid foundation for running serious ML workloads in production.

We automatically generate monitoring endpoints, deployment templates and documentation for your ML service.

Use our library of integrations to package your model, enabling both online and offline serving on any cloud platform.

BentoML supports all major ML frameworks

Build amazing ML services


Sentiment analysis with BERT

This service predicts whether a movie review is positive or negative using a pre-trained BERT model with the TensorFlow framework.

Image Classification

This service identifies objects in a given image using ResNet50 from the ONNX Model Zoo.

Titanic Survival Prediction

This service predicts the survival rate of a given passenger on the Titanic using a model trained with the XGBoost framework.

Check out the gallery

Join our Slack community!

Core ML

$ pip install bentoml --pre

The Slack app is free and allows the community to better connect :)

BentoML for fast and simple

Out-of-the-box framework for versioning, deploying, and monitoring your models

End-to-end solution for collaboration between data science and engineering

Get started with BentoML!