
Reserve thousands of NVIDIA H100s, H200s, and GH200s

Train, customize, and deploy foundation models & LLMs on dedicated Cloud Clusters featuring NVIDIA H100 and H200 Tensor Core GPUs and the GH200 Grace Hopper™ Superchip, connected with NVIDIA Quantum-2 InfiniBand networking.

 
TRUSTED BY FORTUNE 500 COMPANIES & AI STARTUPS
Generally Intelligent · MIT · Voltron Data · Writer · Sony · Samsung · Picsart
GPU SUPERPOWERS

Finally, cloud computing designed for large-scale model training and inference

Lambda Cloud Clusters are designed for machine learning engineers who need the highest-performance NVIDIA GPUs, networking, and storage for large-scale distributed training.

Our clusters use a non-blocking NVIDIA Quantum-2 InfiniBand compute network that lets your ML team train a single large model across thousands of GPUs with no loss in networking speed.
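
To make that concrete, here is a minimal sketch, not taken from Lambda's documentation, of what a multi-node job on such a cluster typically looks like with PyTorch DistributedDataParallel over NCCL (which uses the InfiniBand fabric automatically); the model, sizes, and rendezvous endpoint are placeholders:

```python
# Minimal multi-node data-parallel sketch (PyTorch DDP over NCCL).
# Launched on every node with torchrun, e.g.:
#   torchrun --nnodes=$NUM_NODES --nproc_per_node=8 \
#            --rdzv_backend=c10d --rdzv_endpoint=$HEAD_NODE_IP:29500 train.py
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")      # NCCL rides on InfiniBand
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(4096, 4096).cuda()   # placeholder for a real model
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):
        x = torch.randn(32, 4096, device="cuda")
        loss = model(x).square().mean()          # placeholder loss
        opt.zero_grad()
        loss.backward()                          # gradients all-reduced across all GPUs
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```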

Thousands of the most powerful GPUs

Train large-scale models across thousands of NVIDIA H100s, NVIDIA H200s, or NVIDIA GH200s, with no delays or bottlenecks. Access the latest infrastructure, built for your most demanding AI projects.

Absolute fastest GPU compute fabric

Each GPU is paired 1:1 with a dedicated 400 Gbps link to the Lambda Cloud Cluster compute fabric. This topology is optimal for GPU computing and scales to multi-petabit-per-second aggregate throughput.

Non-blocking InfiniBand networking

The fastest network available, delivering full bandwidth to all GPUs in the cluster simultaneously. It leverages NVIDIA Quantum-2 InfiniBand with support for GPUDirect RDMA and is optimized for massive-scale, full-cluster distributed training.

ENTERPRISE-GRADE GPUs

Only the highest performance GPUs

Lambda Cloud Clusters leverage only the latest and greatest infrastructure, built for the next generation of LLMs and other large-scale models.

NVIDIA H100 SXM

The NVIDIA H100 Tensor Core GPU has 80GB of HBM3 memory at 3.35TB/s, deployed in HGX 8-GPU nodes with NVLink and NVSwitch interconnects, 4th Gen Intel Xeon processors, Transformer Engine with FP8 precision, and second-generation Multi-Instance GPU technology. Learn more.

NVIDIA H200 SXM

The NVIDIA H200 Tensor Core GPU is the first GPU to offer HBM3e — faster, larger memory to fuel the acceleration of generative AI and LLMs. With HBM3e, H200 delivers 141GB of memory at 4.8TB/s, nearly double the capacity of, and 1.4X more bandwidth than, the NVIDIA H100. Learn more.

NVIDIA GH200

The NVIDIA GH200 Grace Hopper Superchip includes a bidirectional, high-bandwidth NVLink-C2C connection between the Grace CPU and Hopper GPU. This enables 7X faster data transfers than PCIe Gen5 and memory coherency, allowing each GH200 to address up to 576GB of combined memory. Learn more.

NVIDIA GH200 Superchip

NVIDIA GH200 Grace Hopper Superchip

NVIDIA Grace CPU with Hopper GPU

72-core ARM CPU with H100 96GB GPU. Single module with NVLink-C2C interconnect that delivers 900 GB/s of bidirectional bandwidth between the Grace CPU and Hopper GPU.

Higher Performance and Faster Memory

Higher-bandwidth system and GPU memory, with 7X faster CPU-to-GPU transfers over NVLink-C2C.

Coherent Memory Architecture

480 GB system memory + 96 GB GPU memory. Each GH200 GPU can address up to 576 GB of memory.

TRUSTED BY EXPERTS

Trusted by world-renowned AI engineers

Lambda's Cloud is used by industry pioneers who have shaped modern deep learning and continue to push what's possible in computer vision, natural language processing, and robotics.

FLEXIBLE CONTRACTS

The only cloud prioritizing flexibility and value for ML teams

NVIDIA H100

$1.89/hr/H100
Minimum Term: 3 Years

NVIDIA H200

H200 Pricing: Contact Sales
Minimum Term: 3 Years

NVIDIA GH200

1-Year Reservation: $5.99/hr/GH200
Minimum Term: 3 Months
 
 
NVIDIA ELITE PARTNER

Lambda is proud to be an NVIDIA Elite Cloud Solutions Provider

Lambda was named the 2023 Americas NVIDIA Partner Network Solution Integration Partner of the Year, the third consecutive year it has received the award.


“Leading enterprises recognize the incredible capabilities of AI and are building it into their operations to transform customer service, sales, operations, and many other key functions. Lambda’s deep expertise, combined with cutting-edge NVIDIA technology, is helping customers create flexible, scalable AI deployments on premises, in the cloud, or at a colocation data center.”

Craig Weinstein, Vice President of the Americas Partner Organization, NVIDIA
NETWORKING SPEED

The fastest network for distributed training of LLMs, foundation models & generative AI

Train large foundation models and LLMs with the fastest networking available in any cloud. Our NVIDIA Quantum-2 InfiniBand networking provides 3200 Gbps of bandwidth for each HGX H100 or H200 node.

This design is purpose-built for NVIDIA GPUDirect RDMA, with maximum inter-node bandwidth and minimum latency across the entire cluster.

The Lambda compute network uses a non-blocking multi-layer topology with zero oversubscription. This provides full networking bandwidth to every NVIDIA GPU in the cluster simultaneously, the optimal design for full-cluster distributed training.
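
NCCL-based frameworks pick up the InfiniBand fabric automatically, but for illustration, here is a hedged sketch of environment settings sometimes used to steer NCCL explicitly; the adapter name and values are assumptions that vary by cluster, and defaults usually suffice:

```python
# Illustrative NCCL settings for an InfiniBand fabric; set before the first
# collective runs. Values here are assumptions -- defaults usually suffice.
import os

os.environ.setdefault("NCCL_DEBUG", "INFO")         # log which transport NCCL picks
os.environ.setdefault("NCCL_IB_HCA", "mlx5")        # prefer Mellanox IB adapters
os.environ.setdefault("NCCL_NET_GDR_LEVEL", "SYS")  # permit GPUDirect RDMA broadly

import torch.distributed as dist

dist.init_process_group(backend="nccl")  # under torchrun, as in the sketch above
```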

GPUDirect RDMA

Skip the CPU and take advantage of GPUDirect RDMA for the fastest distributed training

GPUDirect RDMA creates a direct communication path between NVIDIA GPUs across all nodes in your cluster over NVIDIA Quantum-2 InfiniBand.

It significantly decreases GPU-GPU communication latency and completely offloads the CPU, removing it from all GPU-GPU communications across the network.
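
A quick way to see this in practice is a rough all-reduce throughput probe. The sketch below is illustrative only, run under torchrun across the cluster, and reports a simple effective rate rather than a rigorous bus-bandwidth figure:

```python
# Rough all-reduce throughput probe (illustrative, not a rigorous benchmark).
# With GPUDirect RDMA, NCCL moves these buffers NIC-to-GPU, bypassing the CPU.
import os
import time

import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

buf = torch.randn(64 * 1024 * 1024, device="cuda")  # 256 MiB of fp32
for _ in range(5):                                   # warm-up iterations
    dist.all_reduce(buf)
torch.cuda.synchronize()

iters = 20
t0 = time.perf_counter()
for _ in range(iters):
    dist.all_reduce(buf)
torch.cuda.synchronize()
elapsed = time.perf_counter() - t0

if dist.get_rank() == 0:
    gb_moved = buf.numel() * buf.element_size() * iters / 1e9
    print(f"effective all-reduce rate: ~{gb_moved / elapsed:.1f} GB/s")

dist.destroy_process_group()
```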

PRICING

The best prices and value for GPU cloud clusters in the industry

| Instance type | GPU | GPU memory | vCPUs | Storage | Network bandwidth | Price per hour | Term | # of GPUs |
|---|---|---|---|---|---|---|---|---|
| 8x NVIDIA H100 | H100 SXM | 80 GB | 224 | 30 TB local per 8x H100 | 3200 Gbps per 8x H100 | $1.89/H100/hour | 3 years | 64 - 32,000 |
| 8x NVIDIA H200 | H200 SXM | 141 GB | 224 | 30 TB local per 8x H200 | 3200 Gbps per 8x H200 | Contact Sales | 3 years | 64 - 32,000 |
| 1x NVIDIA GH200 | GH200 Superchip | 96 GB | 72 | 30 TB local per GH200 | 400 Gbps per GH200 | $5.99/GH200/hour | 3-12 months | 10 or 20 |
AI SOFTWARE INSTALLED

Pre-configured for machine learning

Start training your models immediately with pre-configured software, shared storage, and networking for deep learning. All you have to do is choose your NVIDIA GPU nodes and CPU nodes.

Lambda Premium Support for Cloud Clusters includes PyTorch, TensorFlow, NVIDIA CUDA, NVIDIA cuDNN, Keras, and Jupyter. Kubernetes is not included.
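
As a quick sanity check after connecting to a node, something like the following confirms the pre-installed stack; the expected device count assumes an 8-GPU HGX node:

```python
# Verify the pre-configured deep learning stack on a cluster node.
import torch

print(torch.__version__)               # pre-installed PyTorch
print(torch.version.cuda)              # CUDA version PyTorch was built against
print(torch.backends.cudnn.version())  # bundled cuDNN
print(torch.cuda.device_count())       # expect 8 on an HGX H100/H200 node
print(torch.cuda.get_device_name(0))   # e.g. an NVIDIA H100 80GB HBM3
```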

SPIN UP AN INSTANCE

Lambda On-Demand Cloud powered by NVIDIA H100 GPUs

NOW AVAILABLE

On-demand HGX H100 systems with 8x NVIDIA H100 SXM GPUs are now available on Lambda Cloud for only $2.59/hr/GPU. With H100 SXM you get: 

  • More flexibility for teams that need additional compute to build, fine-tune, and deploy generative AI models
  • Enhanced scalability
  • High-bandwidth GPU-to-GPU communication
  • Optimal performance density

Lambda Cloud also has 1x NVIDIA H100 PCIe GPU instances at just $1.99/hr/GPU for smaller experiments.
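
For teams that script their infrastructure, instances can also be launched through Lambda's public Cloud API. The sketch below is a hedged illustration: the endpoint paths, field names, region, and instance-type string are assumptions to verify against the current API documentation:

```python
# Hedged sketch of launching an on-demand instance via Lambda's Cloud API.
# Endpoints, fields, and the instance-type name are assumptions -- verify
# against the current API docs before use.
import os

import requests

API = "https://cloud.lambdalabs.com/api/v1"
auth = (os.environ["LAMBDA_API_KEY"], "")  # API key as basic-auth username

# See which instance types and regions are currently available.
print(requests.get(f"{API}/instance-types", auth=auth).json())

# Launch one 8x H100 SXM instance (type and region names are assumptions).
resp = requests.post(
    f"{API}/instance-operations/launch",
    auth=auth,
    json={
        "region_name": "us-east-1",
        "instance_type_name": "gpu_8x_h100_sxm5",
        "ssh_key_names": ["my-ssh-key"],
    },
)
print(resp.json())
```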