How to serve Kimi-K2-Instruct on Lambda with vLLM

Published on December 22, 2025 by Zach Mueller

When your model doesn't fit on a single GPU, you suddenly need to target multiple GPUs on a single machine, configure a serving stack that actually uses all ...
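As a minimal sketch of what targeting multiple GPUs on one machine looks like with vLLM (the model ID, GPU count, and flag values here are assumptions for illustration, not taken from the article), a tensor-parallel launch might be:

```shell
# Sketch: serve a model too large for one GPU across 8 GPUs on a single node.
# --tensor-parallel-size shards each layer's weights across the GPUs, so all
# of them participate in every forward pass. The model ID and the value 8
# are assumptions; adjust to your hardware and checkpoint.
vllm serve moonshotai/Kimi-K2-Instruct \
  --tensor-parallel-size 8 \
  --port 8000
```

Once the server is up, it exposes an OpenAI-compatible API on the chosen port, so existing OpenAI client code can point at it by changing only the base URL.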