The Lambda Deep Learning Blog

Lambda Cloud Deploys NVIDIA H100 Tensor Core GPUs

Written by Kathy Bui | May 10, 2023 6:33:33 PM

We have some pretty big news to share! Lambda Cloud has deployed a fleet of NVIDIA H100 Tensor Core GPUs, making it one of the first to market with generally available, on-demand H100 GPUs. Currently, 1x NVIDIA H100 PCIe Gen5 GPU instances are live on Lambda Cloud for only $2.40/GPU/hr. This instance type lets Lambda Cloud users try the H100 on a single GPU before scaling up.

1x NVIDIA H100 PCIe GPU Specs

  • GPU memory: 80GB

  • vCPUs: 26

  • System memory: 200GiB

  • Storage: 512GB

NVIDIA H100 PCIe GPUs offer our customers a powerful, cost-effective way to accelerate their high-performance computing and deep learning workloads. The high-performance GPUs enable faster training times, better model accuracy, and increased productivity.

NVIDIA H100 SXM next to launch

NVIDIA H100 SXM GPUs in Lambda Cloud will closely follow the release of NVIDIA H100 PCIe GPUs. For those working with large language models that involve structured sparsity and large-scale distributed workloads, it’s time to start thinking about upgrading to the H100 to speed up your training. Capacity is limited, with more coming soon. 

Lambda features flexible deployment options

Lambda offers several options for teams who need access to NVIDIA H100 GPUs depending on their unique needs:

Lambda Cloud: On-demand Access

Lambda Cloud publicly offers NVIDIA H100 GPUs. Start training in seconds with one-click Jupyter access and instances pre-configured for machine learning. Lambda Cloud has transparent pricing with pay-by-the-second billing. Pay only while your instance is running, with no long-term commitment necessary.
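To get a feel for what pay-by-the-second billing means in practice, here is a minimal Python sketch that estimates the cost of a run at the $2.40/GPU/hr rate quoted in this announcement. The function name `estimate_cost` is made up for illustration, and the sketch assumes simple linear proration of the hourly rate with no minimum charge and rounding to the nearest cent; actual invoices may be calculated differently.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate quoted in this announcement for a 1x H100 PCIe instance.
HOURLY_RATE_USD = Decimal("2.40")
SECONDS_PER_HOUR = 3600


def estimate_cost(seconds_running: int, gpus: int = 1) -> Decimal:
    """Estimate the charge in USD for a run billed by the second.

    Assumption: the hourly rate is prorated linearly per second with no
    minimum charge, then rounded to the nearest cent.
    """
    if seconds_running < 0 or gpus < 1:
        raise ValueError("seconds_running must be >= 0 and gpus >= 1")
    cost = HOURLY_RATE_USD * gpus * seconds_running / SECONDS_PER_HOUR
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # A 90-minute run on a single GPU.
    print(estimate_cost(90 * 60))
```

Under these assumptions, a job that finishes in 90 minutes is charged for 90 minutes rather than being rounded up to two full hours.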

Lambda Private Cloud: Fully Integrated Private Cloud Clusters

Lambda Private Cloud Clusters provide dedicated GPU clusters using the same NVIDIA H100 GPU platforms, high-bandwidth networking, and parallel storage as our on-prem hardware without the capital infrastructure cost. Private clusters are engineered for distributed training and provide high-performance H100 machine learning infrastructure that is scalable from 10s to 1,000s of GPUs with contract lengths starting at three years.

Lambda Scalar, Lambda Hyperplane, and NVIDIA DGX: Your Server Installed in Your Data Center

Lambda Scalar servers are configurable with 1x to 8x NVIDIA H100 PCIe GPUs, with pairs of GPUs interconnected by NVLink. Lambda Hyperplane HGX servers come equipped with 4x or 8x NVIDIA H100 SXM GPUs connected by an NVLink and NVSwitch fabric for the highest GPU-to-GPU performance. Lambda also offers NVIDIA DGX systems, which feature 8x H100 SXM GPUs, an NVLink and NVSwitch fabric, and NVIDIA Base Command software, which provides AI workflow management, cluster management, libraries that accelerate compute, storage, and network infrastructure, and an OS optimized for AI workloads.

Lambda Echelon: Your Cluster Installed in Your Data Center

Lambda Echelon clusters are engineered for machine learning workloads and optimized for distributed training. They are scalable from 10s to 1,000s of NVIDIA GPUs and integrated with high-bandwidth networking, parallel storage, and MLOps platforms. Each cluster is delivered fully racked, configured, and ready to install. Clusters can be built with Lambda Hyperplane HGX-based servers or NVIDIA DGX systems.

Lambda Colocation: Your Hardware Installed in Lambda's Data Center

Lambda Colocation allows you to deploy individual servers or full clusters faster by leveraging Lambda’s data center infrastructure, which is optimized for the power and thermal demands of NVIDIA H100 GPU platforms. Your hardware is installed and supported on-site by Lambda’s engineering team, which reduces downtime and allows for easy upgrades and servicing.

Learn more about Lambda systems powered by NVIDIA H100 GPUs and get started today.