Managed and Unmanaged Slurm

Slurm job management optimized for AI workloads is available on Lambda's 1-Click Clusters.

Slurm Job Management for AI Clusters

Free for a limited time, our Slurm workload scheduler offering is available in early preview on H100 clusters (B200 coming soon) and includes both unmanaged and managed options. Choose unmanaged for full control, or managed to let Lambda handle the administration.

Managed Slurm: Hands-Off Efficiency

Let us handle the complexities of Slurm administration. Managed Slurm provides all the features of Unmanaged, plus comprehensive support and management by Lambda:
  • Slurm patches
  • Job history tracking
  • Technical support — Lambda partners with SchedMD for backend support
  • Node failure detection and replacement
  • Cluster and Slurm daemon health monitoring, including slurmctld, slurmdbd, and the per-node slurmd
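
As a rough illustration of what job history tracking looks like from the user side, the sketch below assumes Slurm accounting is enabled on the cluster and queries it with the standard `sacct` command. The helper names (`build_sacct_command`, `parse_sacct_output`), the user name, and the chosen fields are hypothetical and are not taken from Lambda's configuration.

```python
import shlex

# Standard sacct format fields; this selection is an assumption for the example.
FIELDS = ["JobID", "JobName", "State", "Elapsed"]


def build_sacct_command(user, start):
    """Build (but do not run) a sacct command listing one user's jobs since `start`."""
    return [
        "sacct",
        "--user", user,
        "--starttime", start,          # e.g. "2025-01-01"
        "--format", ",".join(FIELDS),
        "--parsable2",                 # pipe-delimited output, no trailing delimiter
    ]


def parse_sacct_output(text):
    """Turn --parsable2 output (a header row, then one row per job or step) into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split("|")
    return [dict(zip(header, line.split("|"))) for line in lines[1:]]


if __name__ == "__main__":
    # "alice" is a placeholder user; on a cluster, pass the list to subprocess.run.
    print(shlex.join(build_sacct_command("alice", "2025-01-01")))
```

On a login node, the command list could be passed to `subprocess.run(..., capture_output=True, text=True)` and its `stdout` handed to `parse_sacct_output`.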

Unmanaged Slurm: Complete Control

Take the reins with Unmanaged Slurm. You get Lambda's optimized Slurm configuration with built-in features for advanced cluster management, including:
  • Built-in LDAP auth for user/group management
  • Policies based on cgroups
  • Container support (Pyxis, Enroot)
  • Slurm user, operator, and administrator access
  • High Availability (HA)
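
To give a sense of how the container support might be used, here is a minimal sketch that assembles an `srun` command with the `--container-image` and `--container-mounts` flags that Pyxis adds to Slurm. The function name `build_container_srun`, the image, the mount paths, and the GPU count are illustrative assumptions. `--gpus-per-node` also assumes GPUs are configured as a Slurm GRES on the cluster.

```python
import shlex


def build_container_srun(image, command, nodes=1, gpus_per_node=8, mounts=None):
    """Build (but do not run) an srun command that launches `command` in a container.

    `mounts` is an optional list of "SRC:DST" strings, the format Pyxis expects.
    """
    argv = [
        "srun",
        f"--nodes={nodes}",
        f"--gpus-per-node={gpus_per_node}",   # assumes GPUs are exposed as GRES
        f"--container-image={image}",         # flag provided by the Pyxis plugin
    ]
    if mounts:
        argv.append("--container-mounts=" + ",".join(mounts))
    argv.extend(command)
    return argv


if __name__ == "__main__":
    # Placeholder image and paths, for illustration only.
    argv = build_container_srun(
        "ubuntu:22.04",
        ["nvidia-smi"],
        nodes=2,
        mounts=["/home/alice/data:/data"],
    )
    print(shlex.join(argv))
```

Building the argument list separately from running it keeps the sketch testable off-cluster; the same list can be handed to `subprocess.run` on a login node.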

Deploy seamlessly on Lambda's 1-Click Clusters

Both Unmanaged and Managed Slurm run on Lambda's 1-Click Clusters with NVIDIA H100 and NVIDIA HGX B200 GPUs, providing scalable GPU resources for your AI workloads. 1-Click Clusters are logically partitioned, InfiniBand-connected GPU clusters contracted for 1 to 52 weeks, located in data centers with 8x5 continuous presence and 24x7 on-call availability.