
Lambda’s Q2 2025 Launches: AI Innovation at Warp Speed

As we wrap up Q2, it’s time to celebrate Lambda’s relentless drive to push the boundaries of AI infrastructure and tooling. From April through June, we introduced a suite of products and updates that empower developers, researchers, and enterprises to move faster, scale smarter, leverage the latest and greatest open source models, and run workloads in a more secure environment. 

April: Benchmarking Brilliance & DeepSeek V3-0324

MLPerf Inference v5.0: A New Standard in AI Performance

Lambda’s clusters flexed their muscles in the latest MLPerf Inference v5.0 benchmarks, powered by NVIDIA HGX B200 and NVIDIA H200 systems. Our 8-GPU nodes achieved up to 21% higher throughput compared to previous bests, delivering world-class performance on models like GPT-J, Llama 2 70B, and Mixtral 8x7B.

DeepSeek V3-0324 on Inference: Open-Source, High-Performance Reasoning

DeepSeek V3-0324 launched with 685 billion parameters and a Mixture-of-Experts design, delivering top-tier performance across benchmarks like MATH-500 (94%), MMLU Pro (81.2%), and LiveCodeBench (49.2%). With no rate limiting and a cost-effective pricing model, it’s a developer’s dream.
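Because Lambda's Inference API follows the familiar OpenAI-style chat completions shape, calling DeepSeek V3-0324 takes only a few lines. The sketch below uses just the Python standard library; the endpoint URL and model identifier are illustrative assumptions, so check the Inference API docs for the exact values.

```python
# Minimal sketch of an OpenAI-compatible chat completion request to
# DeepSeek V3-0324 on Lambda's Inference API, stdlib only.
# API_URL and MODEL are assumptions, not confirmed values.
import json
import os
import urllib.request

API_URL = "https://api.lambda.ai/v1/chat/completions"  # assumed endpoint
MODEL = "deepseek-v3-0324"                             # assumed model ID

def build_request(prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }

if __name__ == "__main__":
    payload = build_request("Prove that the square root of 2 is irrational.")
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['LAMBDA_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
        print(body["choices"][0]["message"]["content"])
```

Because there is no rate limiting, the same snippet works unchanged inside a batch-evaluation loop.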

May: Streamlining Workflows & Expanding Capabilities

Lambda Managed Slurm: Cluster Management Made Easier

Lambda’s Managed Slurm keeps your AI clusters running efficiently, maximizing resource use and simplifying complex job scheduling. This fully supported Slurm offering optimizes cluster utilization for AI/ML workloads, with features like LDAP-backed user/group management, container support, and high availability for master daemons. 
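Day-to-day use looks like standard Slurm: write a job script, submit it with sbatch. The sketch below shows one way to do that from Python; the partition name, GPU count, and training script are illustrative assumptions, not values from a real cluster.

```python
# Sketch of submitting a batch job to a Managed Slurm cluster.
# The partition name, resource requests, and train.py are assumptions.
import subprocess

JOB_SCRIPT = """\
#!/bin/bash
#SBATCH --job-name=train-llm
#SBATCH --partition=gpu          # assumed partition name
#SBATCH --gpus=8                 # one full 8-GPU node
#SBATCH --time=04:00:00
srun python train.py
"""

def submit(script: str) -> str:
    """Pipe a job script into sbatch and return its stdout (the job ID line)."""
    result = subprocess.run(
        ["sbatch"], input=script, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()

if __name__ == "__main__":
    print(submit(JOB_SCRIPT))
```

Since user and group accounts are LDAP-backed, the same script submits identically for every user on the cluster without per-node account setup.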

Customer Trust Portal: Transparency at Its Best

We launched a full-fledged Customer Trust Portal in partnership with SafeBase by Drata, consolidating our security docs, certifications, and related materials in one spot. This portal is our transparency power move, offering access to SOC 2 Type II reports, pentest summaries, and policy documents covering Data Management, Encryption Standards, Incident Response, and more.

Filesystem S3 Adapter

The new Filesystem S3 Adapter supports a subset of the S3 API (getObject, putObject, deleteObject, and list) directly against Lambda's Filesystem storage. This eliminates the need to provision a VM just to move files, streamlining AI workflows.
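Because the adapter speaks the S3 protocol, any S3 client should work once pointed at the adapter's endpoint. Here is a sketch using boto3; the endpoint URL and bucket name are placeholders, so use the values shown for your filesystem in the Lambda dashboard.

```python
# Sketch of using the Filesystem S3 Adapter via boto3 by overriding
# endpoint_url. Endpoint and bucket names below are placeholders.

def make_client(endpoint_url: str):
    """Create an S3 client that talks to the Filesystem S3 Adapter.

    Credentials are read from the standard AWS_ACCESS_KEY_ID /
    AWS_SECRET_ACCESS_KEY environment variables (set them to the keys
    issued for the adapter).
    """
    import boto3  # imported here so the sketch parses without boto3 installed
    return boto3.client("s3", endpoint_url=endpoint_url)

def upload_and_list(client, bucket: str, key: str, path: str):
    """putObject a local file, then list keys to confirm it landed."""
    client.upload_file(path, bucket, key)           # putObject
    resp = client.list_objects_v2(Bucket=bucket)    # list
    return [obj["Key"] for obj in resp.get("Contents", [])]

if __name__ == "__main__":
    s3 = make_client("https://files.example.lambda.cloud")  # placeholder URL
    print(upload_and_list(s3, "my-filesystem", "ckpt/model.pt", "model.pt"))
```

Existing tooling built on boto3 or the AWS CLI can reuse this pattern unchanged: only the endpoint and credentials differ.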

Cloud Metrics Dashboard: Real-Time Insights for GPU Workloads

The Lambda Cloud Metrics Dashboard provides real-time visibility into your infrastructure, surfacing hardware-level metrics like GPU and VRAM usage directly in the Lambda GPU Cloud dashboard. This helps users make informed decisions and quickly identify potential issues.

Deploy Models with MLflow on Lambda Cloud

Our MLflow on Lambda GPU Cloud walkthrough shows how to run MLflow on Lambda's infrastructure today, using Lambda's powerful compute to streamline your machine learning lifecycle, from tracking experiments to deploying models, with MLflow's existing capabilities on our platform.
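The core of that lifecycle is experiment tracking, which needs nothing Lambda-specific: MLflow logs to a local (or shared-filesystem) store on the instance. A minimal sketch, with illustrative experiment and parameter names:

```python
# Minimal sketch of MLflow experiment tracking as it would run on a
# Lambda instance. The ./mlruns store is local; it could equally live
# on a shared Lambda filesystem or point at a remote tracking server.

def log_training_run(params: dict, final_loss: float) -> str:
    """Log one training run's params and final loss; return the run ID."""
    import mlflow  # imported here so the sketch parses without mlflow installed
    mlflow.set_tracking_uri("file:./mlruns")   # swap for a tracking-server URI
    mlflow.set_experiment("lambda-demo")       # illustrative experiment name
    with mlflow.start_run() as run:
        mlflow.log_params(params)
        mlflow.log_metric("final_loss", final_loss)
        return run.info.run_id

if __name__ == "__main__":
    run_id = log_training_run({"lr": 3e-4, "batch_size": 64}, final_loss=0.042)
    print("logged run", run_id)
```

Running `mlflow ui` in the same directory then serves the experiment browser locally.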

Qwen3 32B on Lambda’s Inference API

Alibaba’s Qwen3-32B, a dense model with 32 billion parameters, was made available on Lambda’s Inference API. Designed for complex tasks that would otherwise require human intervention, it offers hybrid reasoning, multilingual support, and agentic capabilities, making it a powerful tool for developers.

June: Open-Source Excellence & Agent Training

DeepSeek-R1-0528: The Open-Source Titan Live on Lambda’s Inference API

DeepSeek-R1-0528 builds upon the DeepSeek-V3 architecture, employing FP8 quantization and reinforcement learning to enhance its capability to handle complex computations efficiently. It achieved an impressive 87.5% accuracy in the AIME 2025 benchmark and a 73.3% score on LiveCodeBench, outpacing its predecessor and competing models.

Wrapping Up Q2: A Quarter of Innovation

Lambda’s Q2 2025 launches reflect our commitment to providing cutting-edge tools and infrastructure for building Gigawatt-scale AI factories for training and inference. From benchmarking excellence to streamlining workflows and expanding capabilities, we’ve empowered our customers to push the boundaries of what’s possible in AI.

As we look ahead, we remain dedicated to innovation and transparency, empowering SuperIntelligence labs, Enterprises, and Builders alike.

Looking for a GPU cloud? Design your AI Factory today!