Applications built on top of AI models require complex infrastructure that often slows down your developers. BentoCloud helps drive AI innovation with open-source tools that boost developer velocity.
Get started with a high-level service API in a few lines of code, using pre-built inference runtimes that support any model or framework.
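As a rough sketch of what this looks like with the BentoML Python SDK (the model tag "summarizer:latest" and the pipeline's output shape are assumptions for illustration):

```python
import bentoml
from bentoml.io import Text

# Hypothetical model tag: assumes a Transformers summarization pipeline
# was previously saved to the local model store as "summarizer".
runner = bentoml.transformers.get("summarizer:latest").to_runner()

svc = bentoml.Service("summarizer_service", runners=[runner])

@svc.api(input=Text(), output=Text())
async def summarize(text: str) -> str:
    # The pre-built Transformers runtime handles model loading and inference.
    result = await runner.async_run(text)
    return result[0]["summary_text"]
```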
Easily preview changes locally, deploy with one click, and automate CI/CD with the DevOps and MLOps tools you already use and love.
Leverage the BentoML open-source standard and ecosystem to customize inference runtimes, batching configuration, inference graph composition, back-pressure control, and scaling behavior.
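For illustration, a minimal sketch of a custom batchable runtime using BentoML's Runnable API; the EmbedderRunnable class and its placeholder embed method are hypothetical:

```python
import bentoml

# Hypothetical custom runtime: marking a method batchable lets BentoML's
# adaptive batching group concurrent requests along batch_dim.
class EmbedderRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")
    SUPPORTS_CPU_MULTI_THREADING = True

    @bentoml.Runnable.method(batchable=True, batch_dim=0)
    def embed(self, sentences: list[str]) -> list[list[float]]:
        # Placeholder: a real implementation would run a model here.
        return [[0.0] * 384 for _ in sentences]

# Batching behavior is tunable per runner; a service can compose
# multiple runners into a single inference graph.
embedder = bentoml.Runner(
    EmbedderRunnable,
    name="embedder",
    max_batch_size=64,
    max_latency_ms=10,
)

svc = bentoml.Service("embedding_service", runners=[embedder])
```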
BentoCloud provides a fully managed platform, freeing you from infrastructure concerns and allowing you to focus on shipping AI applications.
OpenLLM on BentoCloud gives you state-of-the-art performance for open-source LLMs such as Llama 2, Code Llama, or any of their fine-tuned variants.
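As a sketch of OpenLLM's BentoML runner integration (the model ID is an assumption, and the exact runner call has varied across OpenLLM releases):

```python
import bentoml
import openllm
from bentoml.io import Text

# The model ID is an assumption; any Llama 2 variant on Hugging Face,
# including fine-tuned ones, could be substituted here.
llm_runner = openllm.Runner("llama", model_id="meta-llama/Llama-2-7b-chat-hf")

svc = bentoml.Service(name="llm-llama-service", runners=[llm_runner])

@svc.api(input=Text(), output=Text())
async def prompt(input_text: str) -> str:
    # Invocation follows BentoML's generic runner convention; the precise
    # method name and return shape differ between OpenLLM versions.
    answer = await llm_runner.generate.async_run(input_text)
    return answer
```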
Scale model inference workloads seamlessly on our GPU cloud, optimized for autoscaling and fast cold starts.
BentoCloud manages cloud resources to maximize utilization across models, making it easy to share GPUs, dynamically load and unload models, and parallelize inference across multiple devices.
Pay only for the compute you use, metered by the millisecond, with pay-as-you-go pricing or pre-committed volume discounts.
Usage-Based Pricing:
- Up to 3 Seats
- Up to 3 Active GPU Instances
- Community Support
- No Subscription Fee

Platform Fee + Committed Usage Discounts:
- Unlimited Seats
- Elevated Quota (A100, A10G, and more)
- Dedicated Slack Support
- Service Level Agreement (SLA)
- Option to Bring Your Own VPC