The Lambda Deep Learning Blog

Recent
Lambda Inference API

gpt-oss-20b

Introduction: The GPT-OSS 20B model is a type of transformer-based language model designed for text generation tasks. It was developed by OpenAI using a large ...

Lambda Inference API

gpt-oss-120b

Introduction: The GPT-OSS 120B model is a type of transformer-based language model designed for text generation tasks. It was developed by OpenAI using a ...

Lambda Inference API

qwen-image

Introduction: Qwen-Image is a model designed for image-related tasks, leveraging the capabilities of the diffusers library. Its architecture is based on a ...

Lambda Inference API

kimi-k2-instruct

Introduction: The Kimi K2 Instruct model is a text-generation model developed by moonshotai, built on the transformers library and based on the kimi_k2 ...

Lambda Inference API

apriel-5b-instruct

Introduction: The Apriel 5B Instruct model is a transformer-based text-generation model developed by ServiceNow-AI. It belongs to the Apriel model family and is ...

Lambda Inference API

deepseek-r1-0528

Introduction: The DeepSeek R1 0528 model is based on the deepseek_v3 architecture and utilizes fp8 quantization. This model type is often used for tasks that ...

Lambda Inference API

qwen3-235b-a22b-fp8

Introduction: The Qwen3 235B A22B FP8 model is a type of qwen3_moe model, which is a mixture of experts model designed for text-generation tasks. This model ...

Lambda Inference API

qwen3-32b

Introduction: The Qwen3 32B model is a type of transformer-based model designed for text-generation tasks. It was developed by Qwen and is part of the Qwen3 ...

Lambda Inference API

llama3.1-nemotron-70b-instruct

Introduction: The Llama 3.1 Nemotron 70B Instruct model is a large language model developed by NVIDIA. It is based on the NeMo library and utilizes a specific ...

Lambda Inference API

hermes-3-llama-3.1-405b-fp8

Introduction: The Hermes 3 Llama 3.1 405B model is a type of Llama model, which is an architecture used for text-generation tasks. This model was developed by ...
