Inference models

Recent
Lambda Inference API
hermes3-70b

The Hermes 3 Llama 3.1 70B model is a large language model built on the Llama architecture and fine-tuned for text-generation tasks. It uses ...

Lambda Inference API
hermes3-8b

The Hermes 3 Llama 3.1 8B model is a large language model in the Llama family. It was developed by NousResearch using the ...

Lambda Inference API
llama3.2-3b-instruct

The Llama 3.2 3B Instruct model is a transformer-based text-generation model developed by meta-llama. It is designed to generate text based on ...

Lambda Inference API
llama3.1-405b-instruct-fp8

The Llama 3.1 405B Instruct model is a large language model based on the transformer architecture, which is commonly used for natural language ...

Lambda Inference API
llama3.3-70b-instruct-fp8

The Llama 3.3 70B Instruct model is a transformer-based large language model developed by meta-llama. It was trained on a ...

Lambda Inference API
deepseek-llama3.3-70b

The DeepSeek R1 Distill Llama 70B model is a transformer-based model built on the Llama architecture. It was developed by deepseek-ai ...

Lambda Inference API
llama3.1-70b-instruct-fp8

The Llama 3.1 70B Instruct model is a transformer-based large language model developed by meta-llama. It is designed for text generation ...

Lambda Inference API
llama3.1-8b-instruct

The Llama 3.1 8B Instruct model is a large language model based on the transformer architecture. It is designed to generate text based on ...

Lambda Inference API
llama-4-scout-17b-16e-instruct

The Llama 4 Scout 17B 16E Instruct model is a Llama 4 model, part of a family of large language models developed for various natural ...

Lambda Inference API
llama-4-maverick-17b-128e-instruct-fp8

The Llama 4 Maverick 17B 128E Instruct FP8 model is a Llama 4 model that uses a compressed-tensor quantization approach. This model is ...
