Accelerate Your AI Workflow with FP4 Quantization on Lambda
As AI models grow in complexity and size, the demand for efficient computation becomes paramount. FP4 (4-bit Floating Point) precision emerges as a ...
Published by Anket Sah
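To make the idea of 4-bit floating point concrete, here is a minimal sketch of rounding values onto the standard FP4 (E2M1) magnitude grid with a simple per-tensor absmax scale. This is an illustrative toy, not Lambda's or NVIDIA's actual quantization recipe; the function name and scaling scheme are assumptions for demonstration.

```python
# Representable magnitudes of FP4 in E2M1 format (2 exponent bits, 1 mantissa bit).
FP4_E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_fp4(values):
    """Round each value to the nearest FP4 level after per-tensor absmax scaling."""
    absmax = max(abs(v) for v in values) or 1.0
    scale = absmax / 6.0  # map the largest magnitude onto FP4's max level (6.0)
    out = []
    for v in values:
        mag = abs(v) / scale
        q = min(FP4_E2M1_GRID, key=lambda g: abs(g - mag))  # nearest grid level
        out.append(q * scale * (1.0 if v >= 0 else -1.0))
    return out

print(quantize_fp4([0.9, -2.4, 6.0]))  # → [1.0, -2.0, 6.0]
```

Real FP4 pipelines add per-block scaling factors and calibrated rounding, but the core operation is this snap-to-grid step.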
DeepSeek has just leveled up. The latest release, DeepSeek-R1-0528, is now available on Lambda’s Inference API, delivering a formidable blend of mathematical ...
Published by Thomas Bordes
When it comes to large language model (LLM) inference, cost and performance go hand-in-hand. Single GPU instances are practical and economical; however, models ...
Published by Chuan Li
This blog explores the synergy of DeepSpeed’s ZeRO-Inference, a technology designed to make large AI model inference more accessible and cost-effective, with ...
Published by Chuan Li
In this blog, Lambda showcases the capabilities of NVIDIA’s Transformer Engine, a cutting-edge library that accelerates the performance of transformer models ...
Published by Chuan Li
GPU benchmarks on Lambda’s offering of the NVIDIA H100 SXM5 vs the NVIDIA A100 SXM4 using DeepChat’s 3-step training example.
Published by Chuan Li
This blog post walks you through how to use FlashAttention-2 on Lambda Cloud and outlines NVIDIA H100 vs NVIDIA A100 benchmark results for training GPT-3-style ...
Published by Chuan Li
Available October 2022, the NVIDIA® GeForce RTX 4090 is the newest GPU for gamers, creators, students, and researchers. In this post, we benchmark RTX 4090 to ...
Published by Eole Cervenka
UPDATE 2022-Oct-13: turning off autocast for FP16 speeds inference up by 25%. What do I need for running the state-of-the-art text-to-image model? Can a ...
Published by Chuan Li
We have seen groundbreaking progress in machine learning over the last couple of years. At the same time, massive usage of GPU infrastructure has become key to ...
Published by Chuan Li
NVIDIA® A40 GPUs are now available on Lambda Scalar servers. In this post, we benchmark the A40 with 48 GB of GDDR6 VRAM to assess its training performance ...
Published by Chuan Li
This post compares the Total Cost of Ownership (TCO) for Lambda servers and clusters vs cloud instances with NVIDIA A100 GPUs. We first calculate the TCO for ...
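A TCO comparison of this kind typically amortizes the hardware purchase plus power over a multi-year period and sets it against pay-per-hour cloud spend. The sketch below illustrates the arithmetic only; every figure (hardware price, power draw, electricity rate, cloud hourly rate) is a hypothetical placeholder, not a number from Lambda's analysis.

```python
def on_prem_tco(hw_cost, watts, kwh_rate, years, util=1.0):
    """Hardware purchase price plus electricity over the amortization period."""
    hours = years * 365 * 24 * util
    return hw_cost + (watts / 1000.0) * hours * kwh_rate

def cloud_tco(hourly_rate, years, util=1.0):
    """Pay-per-hour cloud spend over the same period."""
    return hourly_rate * years * 365 * 24 * util

# Hypothetical example: a $120k 8-GPU server drawing 4.5 kW at $0.12/kWh,
# versus a $12/hr cloud instance, both run continuously for 3 years.
server = on_prem_tco(hw_cost=120_000, watts=4500, kwh_rate=0.12, years=3)
cloud = cloud_tco(hourly_rate=12.0, years=3)
print(round(server), round(cloud))  # server cost vs cloud cost over 3 years
```

Real TCO models also fold in colocation, networking, staffing, and utilization assumptions, which is what the full post walks through.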
Published by Chuan Li
Check out the discussion on Reddit (160 upvotes, 41 comments).
Published by Michael Balaban
Check out the discussion on Reddit (195 upvotes, 23 comments).
Published by Michael Balaban
Lambda is now shipping RTX A6000 workstations & servers. In this post, we benchmark the RTX A6000's PyTorch and TensorFlow training performance. We compare ...