
Introducing the Lambda Inference API: Lowest-Cost Inference Anywhere
Today, we’re excited to announce the GA release of the Lambda Inference API, the lowest-cost inference anywhere. For just a fraction of a cent, you can access ...
Lambda’s 1-Click Clusters(1CC) provide AI teams with streamlined access to scalable, multi-node GPU clusters, cutting through the complexity of distributed infrastructure. Now, we're pushing the envelope further by integrating NVIDIA's Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) into our multi-tenant 1CC environments. This technology reduces communication latency and improves bandwidth efficiency, directly accelerating training speed of distributed AI workloads.
Published on by Anket Sah
Today, we’re excited to announce the GA release of the Lambda Inference API, the lowest-cost inference anywhere. For just a fraction of a cent, you can access ...
Published on by Nick Harvey
When it comes to large language model (LLM) inference, cost and performance go hand-in-hand. Single GPU instances are practical and economical; however, models ...
Published on by Thomas Bordes
We're excited to announce the launch of the NVIDIA GH200 Grace Hopper Superchip on Lambda On-Demand. Now, with just a few clicks in your Lambda Cloud account, ...
Published on by Nick Harvey
Create a cloud account instantly to spin up GPUs today or contact us to secure a long-term contract for thousands of GPUs