August 23, 2023 • Written By Sherlock Xu
Today, we are thrilled to unveil the latest member of the BentoML ecosystem: OneDiffusion, an open-source, all-in-one platform designed to streamline the deployment of diffusion models. It supports both pretrained diffusion models and models fine-tuned with LoRA adapters, allowing you to run a variety of image generation tasks with ease and flexibility. Because it integrates seamlessly with the BentoML framework, you can use OneDiffusion to deploy diffusion models to the cloud or on-premises, and build powerful and scalable AI applications.
As advancements in AI surge forward, diffusion models are carving a niche for themselves, with Stable Diffusion (SD) standing at the forefront of their breakthroughs. Stable Diffusion models excel at generating detailed visuals based on text cues and are able to perform tasks such as inpainting and outpainting. Stable Diffusion XL 1.0, the recent pinnacle of Stability AI’s text-to-image suite, can create vivid images from shorter prompts and even embed textual content within these visuals.
However, diffusion models aren’t without challenges. Their intricate architecture and heavy computational demands make production serving and deployment a daunting task. Traditional deployment methodologies are often unable to cater to the unique requirements of these models, leading to inefficiencies and performance bottlenecks.
At BentoML, we work to empower every organization to compete and succeed with AI applications. We believe that democratizing the serving and deployment of diffusion models represents an important step towards this mission. Following our previous endeavor with OpenLLM, an open-source solution for running inference with any open-source LLM, we embarked on the journey to create OneDiffusion.
OneDiffusion isn’t just another deployment tool; it’s a tailor-made solution for diffusion models. By offering features specifically designed to address the deployment complexities, OneDiffusion makes deploying diffusion models more straightforward than ever.
OneDiffusion is designed for AI application developers who require a robust and flexible platform for deploying diffusion models in production. Key features include:
To use OneDiffusion, make sure you have Python 3.8 (or later) and pip installed, and then install OneDiffusion by using pip.
Once it is installed, you can start a Stable Diffusion server by running the following command. By default, OneDiffusion uses stabilityai/stable-diffusion-2 and downloads the model automatically to the BentoML Model Store if it has not been registered before.
This starts a server at http://0.0.0.0:3000/. You can interact with it by visiting the Swagger UI or sending a request via curl.
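As a rough sketch of the request side, the snippet below builds a JSON POST for the server using only the Python standard library. The /text2img path and the prompt and num_inference_steps fields are assumptions made for illustration and are not confirmed by this post; check the Swagger UI for the routes and schema your server actually exposes.

```python
import json
import urllib.request

# Address printed by the server on startup (taken from the text above).
BASE_URL = "http://0.0.0.0:3000"


def build_text2img_request(prompt, steps=50, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for image generation.

    ASSUMPTION: the route name "/text2img" and the payload fields are
    illustrative; verify them against the Swagger UI of your server.
    """
    payload = {"prompt": prompt, "num_inference_steps": steps}
    return urllib.request.Request(
        url=f"{base_url}/text2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_image(prompt, out_path="output.jpg"):
    """Send the request and write the raw response bytes to out_path."""
    request = build_text2img_request(prompt)
    with urllib.request.urlopen(request) as response:
        image_bytes = response.read()
    with open(out_path, "wb") as f:
        f.write(image_bytes)
    return out_path
```

With a server running, calling generate_image("a bento box") would save the returned bytes to output.jpg, assuming the endpoint responds with raw image data.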
To use a specific model version, add the --model-id option as below:
To specify another pipeline, use the --pipeline option as below. The img2img pipeline allows you to modify images based on a given prompt and image.
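When launching the server from a script, the two options can be combined. The helper below is a minimal sketch that assembles the command line as a list. The --model-id and --pipeline option names come from this post, while the "start stable-diffusion" subcommand is an assumption, so confirm it against the CLI's help output before relying on it.

```python
import subprocess


def build_start_command(model_id=None, pipeline=None):
    """Return the argument list for starting a Stable Diffusion server.

    ASSUMPTION: "onediffusion start stable-diffusion" is the start
    subcommand; only the two option names are taken from the article.
    """
    command = ["onediffusion", "start", "stable-diffusion"]
    if model_id is not None:
        command += ["--model-id", model_id]
    if pipeline is not None:
        command += ["--pipeline", pipeline]
    return command


def start_server(model_id=None, pipeline=None):
    """Launch the server as a child process and return its handle."""
    return subprocess.Popen(build_start_command(model_id, pipeline))
```

Keeping the command construction separate from the process launch makes it easy to log or inspect the exact arguments before anything is started.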
OneDiffusion also supports running Stable Diffusion XL v1.0. To start an XL server, simply run:
Similarly, visit http://0.0.0.0:3000/ or send a request via curl to interact with the XL server. Example prompt:
Low-Rank Adaptation (LoRA) is a training method to fine-tune models without the need to retrain all parameters. You can add LoRA weights to your diffusion models for specific data needs.
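To make the savings concrete, here is a small worked example based on the standard LoRA formulation, in which a frozen weight matrix W of shape d x k is adjusted by a low-rank product of B (d x r) and A (r x k). The layer dimensions and rank below are illustrative values, not figures from any specific Stable Diffusion checkpoint.

```python
def full_finetune_params(d, k):
    """Trainable parameters when the whole d x k weight matrix is updated."""
    return d * k


def lora_params(d, k, r):
    """Trainable parameters for a rank-r LoRA update W + B @ A.

    B has shape (d, r) and A has shape (r, k); W itself stays frozen.
    """
    return d * r + r * k


# Illustrative sizes only (hypothetical layer, not from a real checkpoint).
d, k, r = 1024, 1024, 8
full = full_finetune_params(d, k)
lora = lora_params(d, k, r)
print(f"full fine-tuning: {full} parameters")
print(f"LoRA (rank {r}): {lora} parameters")
print(f"ratio: {lora / full:.4f}")
```

Because only the two small matrices are trained and stored, a LoRA file is far smaller than a full model checkpoint, which is what makes swapping adapters on top of one base model practical.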
To apply LoRA weights when starting the server, use the --lora-weights option as below:
Alternatively, dynamically load LoRA weights by adding the
By specifying the path of LoRA weights at runtime, you can influence model outputs dynamically. Even with identical prompts, the application of different LoRA weights can yield vastly different results. Example output (oil painting vs. pixel):
You can create a BentoML Runner for a diffusion model by using bentoml.diffusers_simple.create_runner, which automatically downloads the specified model if it does not exist locally.
You can then wrap the Runner into a BentoML Service. See the BentoML documentation for more details.
You can build a Bento for an existing diffusion model by running onediffusion build. To specify the model to be packaged into the Bento, use --model-id. Otherwise, OneDiffusion packages the default model into the Bento.
Once your Bento is ready, you can push it to BentoCloud.
The recent wave of AI has propelled diffusion models to great heights. As these models become indispensable in AI applications, the challenges in deploying them also become pronounced. We recognize that many are daunted by the intricacies of rolling out diffusion models in real-world scenarios. With the open sourcing of OneDiffusion, we look to alleviate these concerns, making the deployment process smoother and more intuitive. However, open source is merely the beginning. Our work extends beyond that, and we look forward to working with the community to improve the project in the following ways:
We invite contributions of all kinds to OneDiffusion! Check out the following resources to start your OneDiffusion journey and stay tuned for more announcements about OneDiffusion and BentoML.
BentoML is the platform for AI developers to build, ship, and scale AI applications. Headquartered in San Francisco, BentoML’s open source products are enabling thousands of organizations’ mission-critical AI applications around the globe. Our serverless cloud platform brings developer velocity and cost-efficiency to enterprise AI use cases. BentoML is on a mission to empower every organization to compete and succeed with AI. Visit our website to learn more.