BentoML is an open platform that simplifies ML model deployment and enables you to serve your models at production scale in minutes.
Find the right tool for your model serving needs
100x the throughput of a regular Flask-based model server, thanks to our adaptive micro-batching mechanism. See our published benchmarks for details.
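As a minimal sketch of how batching is switched on, assuming the BentoML 1.0 Python API (the --pre preview installed below; the model and the "iris_clf" name are illustrative, not from our docs):

import bentoml
from sklearn import svm
from sklearn.datasets import load_iris

# Train a small stand-in model for the example.
X, y = load_iris(return_X_y=True)
clf = svm.SVC(probability=True).fit(X, y)

# Marking the predict signature as batchable lets the model server merge
# concurrent requests into a single predict() call (adaptive micro-batching).
bentoml.sklearn.save_model(
    "iris_clf",
    clf,
    signatures={"predict": {"batchable": True, "batch_dim": 0}},
)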
Built to work with DevOps & Infrastructure tools
Knative
Cloud Run
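Any bento can be containerized into a standard Docker image and deployed to these targets. For example, assuming the 1.0 preview CLI, an already-built iris_classifier bento, and a hypothetical my-project container registry:

$ bentoml containerize iris_classifier:latest -t gcr.io/my-project/iris-classifier
$ docker push gcr.io/my-project/iris-classifier
$ gcloud run deploy iris-classifier --image gcr.io/my-project/iris-classifier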
A central dashboard for pushing, pulling, and rolling back models streamlines deployment. No more complicated handoffs between data science and engineering teams.
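With a Yatai deployment configured, bentos move between your local store and the central registry with push and pull (iris_classifier is the assumed bento name from the sketches above):

$ bentoml push iris_classifier:latest
$ bentoml pull iris_classifier:latest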
We automatically generate monitoring endpoints, deployment templates and documentation for your ML service.
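For instance, serving a bento locally with the preview CLI exposes health and Prometheus metrics endpoints out of the box (the service.py:svc reference is hypothetical; paths and the default port 3000 are as in the 1.0 preview):

$ bentoml serve service.py:svc
$ curl http://localhost:3000/healthz
$ curl http://localhost:3000/metrics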
Use our library of integrations to package your model, enabling both online and offline serving on any cloud platform.
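A sketch of the offline half, assuming the 1.0 API and the "iris_clf" model saved earlier: online serving wraps the same saved model in a runner inside a bentoml.Service, while batch scoring loads it back into the current process.

import numpy as np
import bentoml

# Offline (batch) scoring: load the saved model and score a whole
# dataset at once, outside of any model server.
model = bentoml.sklearn.load_model("iris_clf:latest")
batch = np.random.rand(1000, 4)  # stand-in for a real feature batch
predictions = model.predict(batch)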
BentoML supports all major ML frameworks, including TensorFlow, PyTorch, XGBoost, ONNX, and Core ML.
Build amazing ML services
Featured
Sentiment Analysis with BERT
This service predicts whether a movie review is positive or negative using a pre-trained BERT model with the TensorFlow framework.
Image Classification
This service identifies objects in a given image using ResNet50 from the ONNX model zoo.
Titanic Survival Prediction
This service predicts the survival probability of a given passenger on the Titanic using a model trained with the XGBoost framework.
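To show how a featured service is wired together, here is a hypothetical skeleton of the Titanic example under the 1.0 preview API (the "titanic_model" and service names are assumptions, not the example repo's actual identifiers):

import bentoml
from bentoml.io import NumpyNdarray, PandasDataFrame

# Load the saved XGBoost model as a runner and attach it to a Service.
runner = bentoml.xgboost.get("titanic_model:latest").to_runner()
svc = bentoml.Service("titanic_survival_prediction", runners=[runner])

@svc.api(input=PandasDataFrame(), output=NumpyNdarray())
def predict(passenger_features):
    # Return the predicted survival probability for each passenger row.
    return runner.predict.run(passenger_features)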
$ pip install bentoml --pre
$ helm repo add yatai https://bentoml.github.io/yatai-chart
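After adding the chart repository, a typical Yatai install looks like the following (chart name and the yatai-system namespace as suggested by the yatai-chart README; adjust for your cluster):

$ helm repo update
$ helm install yatai yatai/yatai -n yatai-system --create-namespace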
Our Slack community is free to join and a great way to connect with other BentoML users.
BentoML for fast and simple model serving:
An out-of-the-box framework for versioning, deploying, and monitoring your models
An end-to-end solution for collaboration between data science and engineering