Kimi K2 Thinking: what 200+ tool calls mean for production
TL;DR: Kimi K2 Thinking is Moonshot AI's open-source reasoning model, scoring 44.9% on Humanity's Last Exam with the ability to chain 200-300 sequential tool ...
Published by Lea Alcantara
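The entry above describes a model that chains 200-300 sequential tool calls. As a rough illustration of what that means structurally (this is a hypothetical sketch, not Moonshot's API: `fake_model`, `TOOLS`, and `add_one` are invented stand-ins), an agentic loop boils down to the model repeatedly choosing the next tool call from prior results until it decides it is done:

```python
# Minimal sketch of a sequential tool-calling loop. In a real agent,
# fake_model would be an LLM call that returns either a tool invocation
# or a final answer; here it is a deterministic stand-in.

def fake_model(state):
    """Stand-in for a model choosing the next tool call from prior results."""
    if state["calls"] < 3:
        return {"tool": "add_one", "arg": state["value"]}
    return None  # the model decides it is finished

TOOLS = {"add_one": lambda x: x + 1}  # hypothetical tool registry

def run_agent(max_calls=300):
    # max_calls caps the chain, mirroring the 200-300 call budget above
    state = {"calls": 0, "value": 0}
    while state["calls"] < max_calls:
        action = fake_model(state)
        if action is None:
            break
        state["value"] = TOOLS[action["tool"]](action["arg"])
        state["calls"] += 1
    return state

print(run_agent())  # {'calls': 3, 'value': 3}
```

The point of the sketch is that "200+ tool calls" is a property of how long the model can sustain this loop coherently, not a different control structure.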

TL;DR: SkyPilot is an open-source orchestration tool that automates ML job deployment on Lambda Cloud. This tutorial covers installation, configuration, and ...
Published by Cody Brownstein

2025 was a year of momentum in AI: intelligence progressed through new methods, open-source communities released competitive models, and research labs ...
Published by Lea Alcantara

This guide demonstrates how to scale JAX-based LLM training from a single GPU to multi-node clusters on NVIDIA Blackwell infrastructure. We present a ...
Published by Jessica Nicholson

When your model doesn’t fit on a single GPU, you suddenly need to target multiple GPUs on a single machine, configure a serving stack that actually uses all ...
Published by Zach Mueller

JAX unlocks distinct advantages on GPUs: automatic kernel fusion via XLA, composable transformations, and hardware-agnostic code that moves between ...
Published by Jessica Nicholson

Inference at scale is still too slow. Large models often stall under real-world load, burning time, compute, and user trust. That’s the problem we set out to ...
Published by Anket Sah

Graphics Processing Units (GPUs) were originally designed to handle computer graphics, like making video games look realistic or helping Netflix ...
Published by Jessica Nicholson

NVIDIA Blackwell GPUs are now available as 8x Lambda Instances On-Demand, featuring the powerful NVIDIA HGX™ B200 in addition to our trusted lineup.
Published by Anket Sah

If you've been anywhere near LLMs lately, you've probably heard the word "reasoning" thrown around more than a frisbee at a college campus. GPT-4 can "reason" ...
Published by Jessica Nicholson

Lambda’s 1-Click Clusters (1CC) provide AI teams with streamlined access to scalable, multi-node GPU clusters, cutting through the complexity of distributed ...
Published by Anket Sah

As AI models grow in complexity and size, the demand for efficient computation becomes paramount. FP4 (4-bit floating point) precision emerges as a ...
Published by Anket Sah
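To make the FP4 entry above concrete: the commonly used 4-bit floating-point element format (E2M1: 1 sign bit, 2 exponent bits, 1 mantissa bit) can represent only the values {±0, ±0.5, ±1, ±1.5, ±2, ±3, ±4, ±6}. A minimal reference sketch of round-to-nearest FP4 quantization, for intuition only (not a hardware kernel, and omitting the per-block scale factors real FP4 schemes apply):

```python
# The 8 non-negative E2M1 code points; negating them gives the full set.
FP4_POS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_GRID = sorted({s * v for v in FP4_POS for s in (1.0, -1.0)})

def quantize_fp4(x):
    """Round a float to the nearest representable FP4 (E2M1) value."""
    return min(FP4_GRID, key=lambda g: abs(g - x))

print([quantize_fp4(v) for v in (0.2, 1.7, 2.4, 10.0)])
# -> [0.0, 1.5, 2.0, 6.0]
```

The coarse grid is why production FP4 schemes pair each small block of values with a higher-precision scale: the scale stretches the 16 code points over that block's dynamic range.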

In AI, scaling doesn’t always mean “bigger.” That’s why we champion lean, efficient LLM design that maximizes performance while minimizing compute cost and ...
Published by Anket Sah

Lambda + dstack: Empowering your ML team with rock-solid infrastructure for distributed reasoning agent training
Published by dstack

DeepSeek has just leveled up. The latest release, DeepSeek-R1-0528, is now available on Lambda’s Inference API, delivering a formidable blend of mathematical ...
Published by Anket Sah