
AI Cloud Pricing

Clear, straightforward pricing for On-Demand Instances, 1-Click Clusters, Private Cloud, and serverless Inference.

Contact us for reserved capacity at our lowest prices.

1-Click Clusters Pricing

On-demand. Self-serve. Short- or long-term. As low as $1.85/GPU/hour for H100 or $2.99/GPU/hour for B200 with committed usage. Learn more

16-1.5k NVIDIA HGX B200 GPUs
Type Commitment As low as (per GPU-hour)
On-demand 1 week+ $3.79
Reserved 1 year $3.49 (contact us)
Reserved 2 years $3.29 (contact us)
Reserved 3 years $2.99 (contact us)
NVIDIA H100 GPUs
Type Commitment As low as (per GPU-hour)
On-demand 1 week-3 months $2.69 (contact us)
Reserved 3-6 months $2.29 (contact us)
Reserved 6-12 months $2.19 (contact us)
Reserved 1 year $1.85 (contact us)

 

On-Demand Cloud Pricing

Access high-power GPUs when you need them. Pay by the minute with no egress fees; a worked billing example follows the pricing tables below. Learn more

Choose your GPU configuration:

8-GPU instances
GPUs VRAM/GPU vCPUs RAM STORAGE PRICE/GPU/HR*
On-demand 8x NVIDIA H100 SXM 80 GB 208 1800 GiB 22 TiB SSD $2.99
Reserved 8x NVIDIA H100 SXM (min. 32 GPUs) 80 GB 208 1800 GiB 22 TiB SSD CONTACT SALES
On-demand 8x NVIDIA A100 SXM 80 GB 240 1800 GiB 19.5 TiB SSD $1.79
On-demand 8x NVIDIA A100 SXM 40 GB 124 1800 GiB 5.8 TiB SSD $1.29
On-demand 8x NVIDIA Tesla V100 16 GB 88 448 GiB 5.8 TiB SSD $0.55
*plus applicable sales tax
4-GPU instances
GPUs VRAM/GPU vCPUs RAM STORAGE PRICE/GPU/HR*
On-demand 4x NVIDIA H100 SXM 80 GB 104 900 GiB 11 TiB SSD $3.09
On-demand 4x NVIDIA A100 PCIe 40 GB 120 900 GiB 1 TiB SSD $1.29
On-demand 4x NVIDIA A6000 48 GB 56 400 GiB 1 TiB SSD $0.80
*plus applicable sales tax
2-GPU instances
GPUs VRAM/GPU vCPUs RAM STORAGE PRICE/GPU/HR*
On-demand 2x NVIDIA H100 SXM 80 GB 52 450 GiB 5.5 TiB SSD $3.19
On-demand 2x NVIDIA A100 PCIe 40 GB 60 450 GiB 1 TiB SSD $1.29
On-demand 2x NVIDIA A6000 48 GB 28 200 GiB 1 TiB SSD $0.80
*plus applicable sales tax
1-GPU instances
GPUs VRAM/GPU vCPUs RAM STORAGE PRICE/GPU/HR*
On-demand 1x NVIDIA GH200 96 GB 64 432 GiB 4 TiB SSD $1.49
On-demand 1x NVIDIA H100 SXM 80 GB 26 225 GiB 2.75 TiB SSD $3.29
On-demand 1x NVIDIA H100 PCIe 80 GB 26 225 GiB 1 TiB SSD $2.49
On-demand 1x NVIDIA A100 SXM 40 GB 30 220 GiB 512 GiB SSD $1.29
On-demand 1x NVIDIA A100 PCIe 40 GB 30 225 GiB 512 GiB SSD $1.29
On-demand 1x NVIDIA A10 24 GB 30 226 GiB 1.3 TiB SSD $0.75
On-demand 1x NVIDIA A6000 48 GB 14 100 GiB 512 GiB SSD $0.80
On-demand 1x NVIDIA Quadro RTX 6000 24 GB 14 46 GiB 512 GiB SSD $0.50
*plus applicable sales tax
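
To make per-minute billing concrete, here is a minimal sketch of how an on-demand bill can be estimated from the hourly rates in the tables above. The helper function is purely illustrative and not part of any Lambda tooling; it simply prorates the listed per-GPU-hour price by the minute.

# Illustrative on-demand cost estimate: instances are billed by the minute
# at the listed per-GPU-hour rate (sales tax not included).

def estimate_on_demand_cost(price_per_gpu_hour: float, num_gpus: int, minutes: float) -> float:
    """Estimated USD cost for one on-demand instance run for `minutes`."""
    return price_per_gpu_hour * num_gpus * (minutes / 60)

# Example: an 8x NVIDIA H100 SXM instance at $2.99/GPU/hr used for 95 minutes.
cost = estimate_on_demand_cost(price_per_gpu_hour=2.99, num_gpus=8, minutes=95)
print(f"Estimated cost: ${cost:.2f}")  # roughly $37.87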

Transparent pricing

For On-Demand instances, pay by the minute only for what you use. Get discounted pricing for long-term commitments.

NVIDIA HGX B200 available as low as $2.99/GPU/hour.

Flexible commitments

We know your AI needs can change—and fast.

Build a flexible reservation tailored precisely to your timeline and budget, without getting locked into a particular GPU.

Unmatched AI expertise

We're engineers, not resellers. We focus on AI/ML infrastructure with a no-BS approach.

Think high-performance compute, transparent pricing, and support from people who get it.

Private Cloud Pricing

Single-tenant, caged clusters for large AI deployments. One-, two-, or three-year contracts for 1,000 to 64,000 GPUs with 3.2 Tb/s networking.

INSTANCE TYPE GPU GPU MEMORY vCPUs STORAGE NETWORK BANDWIDTH
NVIDIA HGX B200 B200 SXM 180 GB 224 60 TB local per 8x B200 3200 Gbps per 8x B200
NVIDIA HGX H200 H200 SXM 141 GB 224 30 TB local per 8x H200 3200 Gbps per 8x H200
NVIDIA H100 H100 SXM 80 GB 224 30 TB local per 8x H100 3200 Gbps per 8x H100

As low as $2.99/GPU/hour for B200 with a multi-year commitment. Learn more

Inference API Pricing

Use the latest open-source models, scale effortlessly, and only pay for the tokens you use, with no rate limits. A sample request and cost estimate follow the pricing tables below. Learn more or read the docs


Model Quantization Context Price per 1M input tokens Price per 1M output tokens
DeepSeek-R1-0528 FP8 164K $0.50 $2.18
DeepSeek-V3-0324 FP8 164K $0.34 $0.88
Qwen-3-32B FP8 41K $0.10 $0.30
Llama-4-maverick-17b-128e-instruct-fp8 FP8 1M $0.18 $0.60
Llama-4-scout-17b-16e-instruct FP8 1M $0.08 $0.30
Llama-3.1-8B-instruct BF16 131K $0.025 $0.04
Llama-3.1-70B-instruct FP8 131K $0.12 $0.30
Llama-3.1-405B-instruct FP8 131K $0.80 $0.80

* plus applicable sales tax

 

Model Quantization Context Price per 1M input tokens Price per 1M output tokens
DeepSeek-llama3.3-70b FP8 131K $0.20 $0.60
Llama-3.3-70B-instruct FP8 131K $0.12 $0.30
Llama-3.2-3B-instruct FP8 131K $0.015 $0.025
Hermes-3-Llama-3.1-8B BF16 131K $0.025 $0.04
Hermes-3-Llama-3.1-70B FP8 131K $0.12 $0.30
Hermes-3-Llama-3.1-405B FP8 131K $0.80 $0.80
LFM-40b BF16 66K $0.15 $0.15
Llama3.1-nemotron-70b-instruct FP8 131K $0.12 $0.30
Qwen2.5-Coder-32B BF16 33K $0.07 $0.16

* plus applicable sales tax
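
Because the Inference API is OpenAI-compatible (see the docs linked above), existing OpenAI client code can be pointed at Lambda's endpoint. The sketch below is illustrative only: the base URL and model identifier are assumptions to confirm against the docs, and the final lines simply apply the Llama-4-scout rates from the table to the reported token usage.

# Minimal sketch of a chat completion request via the OpenAI Python client.
# The base URL and model ID below are assumptions; check the Inference API
# docs for the current values.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LAMBDA_API_KEY",           # generated in the Lambda dashboard
    base_url="https://api.lambda.ai/v1",     # assumed endpoint; confirm in the docs
)

response = client.chat.completions.create(
    model="llama-4-scout-17b-16e-instruct",  # assumed model ID; see the table above
    messages=[{"role": "user", "content": "Summarize KV caching in two sentences."}],
)
print(response.choices[0].message.content)

# Rough cost estimate at the listed Llama-4-scout rates
# ($0.08 per 1M input tokens, $0.30 per 1M output tokens).
usage = response.usage
cost = usage.prompt_tokens * 0.08 / 1e6 + usage.completion_tokens * 0.30 / 1e6
print(f"Estimated request cost: ${cost:.6f}")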

Frequently Asked Questions

What makes Lambda different from other providers?

Lambda is the only cloud provider focused solely on AI. We offer high-performance GPU cloud compute with transparent pricing, and you can get started with 1-Click Clusters with no commitment and no sales call.

What support do you offer?

We offer direct access to leading AI infrastructure engineers who understand your models and requirements—no tiered support queues or generic help desks.

Do you offer Managed Kubernetes and Managed Slurm?

Yes. See our Orchestration page for more details.

What makes Lambda's Inference API different?

Our Inference API features the lowest rates on leading open-source models with no rate limits.

What are 1-Click Clusters?

Lambda's 1-Click Clusters feature 16-512 NVIDIA GPUs interconnected with InfiniBand networking, with terms from 1 week to 3 years.

Is there a minimum commitment for your On-Demand Cloud?

No. Spin up instances when you need them, spin them down when you don’t.

How does Lambda handle security?

Security here isn't just a department; it's a shared mission. We build a culture where everyone understands their part in protecting our systems and—more importantly—our customers' data.

Does Lambda support Enterprises?

Yes. Lambda is an end-to-end, secure AI compute platform: Private Cloud, on-premises GPU servers, and desktop workstations, all backed by enterprise-grade security (SOC 2 Type II compliant) and best-in-class support from AI experts. Learn more