The Lambda Deep Learning Blog

Published on December 20, 2023 by Chuan Li

Benchmarking ZeRO-Inference on the NVIDIA GH200 Grace Hopper Superchip

This blog explores the synergy of DeepSpeed’s ZeRO-Inference, a technology designed to make large AI model inference more accessible and cost-effective, with ...

Published on November 21, 2023 by Chuan Li

Unleashing the power of Transformers with NVIDIA Transformer Engine

In this blog, Lambda showcases the capabilities of NVIDIA’s Transformer Engine, a cutting-edge library that accelerates the performance of transformer models ...

Published on October 12, 2023 by Chuan Li

DeepChat 3-Step Training At Scale: Lambda’s Instances of NVIDIA H100 SXM5 vs A100 SXM4

GPU benchmarks on Lambda’s offering of the NVIDIA H100 SXM5 vs the NVIDIA A100 SXM4 using DeepChat’s 3-step training example.

Published on August 24, 2023 by Chuan Li

How FlashAttention-2 Accelerates LLMs on NVIDIA H100 and A100 GPUs

This blog post walks you through how to use FlashAttention-2 on Lambda Cloud and outlines NVIDIA H100 vs NVIDIA A100 benchmark results for training GPT-3-style ...

Published on November 1, 2022 by Lauren Watkins

Voltron Data Case Study: Why ML teams are using Lambda Reserved Cloud Clusters

One of the biggest trends in machine learning is the development of large transformer models like BERT and diffusion models like Stable Diffusion. These large ...

Published on September 22, 2021 by Chuan Li

Tesla A100 Server Total Cost of Ownership Analysis

This post compares the Total Cost of Ownership (TCO) for Lambda servers and clusters vs cloud instances with NVIDIA A100 GPUs. We first calculate the TCO for ...

Published on October 6, 2020 by Stephen Balaban

Lambda Echelon – a turn key GPU cluster for your ML team

Introducing the Lambda Echelon: Lambda Echelon is a GPU cluster designed for AI. It comes with the compute, storage, network, power, and support you need to ...

Published on May 22, 2020 by Stephen Balaban

NVIDIA A100 GPU Benchmarks for Deep Learning

Lambda customers are starting to ask about the new NVIDIA A100 GPU and our Hyperplane A100 server. The A100 will likely see large gains on models like ...