Lambda Echelon

GPU clusters designed for deep learning

Accelerate your team's AI progress with a Lambda Echelon HPC cluster.

Echelon clusters check all of the boxes

[✓] NVIDIA GPU Compute

We use Lambda Scalar and Hyperplane servers with NVIDIA Tensor Core GPUs as the building blocks for your Echelon cluster. With an Echelon cluster, your team will be training in minutes instead of days.
[✓] Fast, All-Flash Storage

Lambda-engineered network storage servers provide rapid access to your training and inference data. The Lambda storage administration dashboard makes managing your cluster a breeze.
[✓] HPC Networking

Rely on Lambda's HPC engineers to design an optimal network topology for you. Whether you want Ethernet or InfiniBand, Echelon network designs leverage GPU Direct RDMA to accelerate multi-node distributed training and data access.
[✓] Enterprise White Glove Support

Lambda Echelon comes racked, stacked, labeled, and cabled. It doesn't stop there: each cluster is backed by engineers who love to go above and beyond, and comes with access to the expertise of the engineers who designed it. We abstract away the complexity of HPC clusters so you can focus on what you do best. Just roll it out of the crate, plug it in, and start training.

All-in-one Rack Level Solution

A single vendor for your entire cluster

One relationship to rule them all

Working with Lambda means you won't be duct-taping a solution together from multiple vendors. Lambda provides your cluster’s compute, network, and storage, so procurement goes through a single vendor.
Compute, Storage, and Networking that work together

Echelon clusters have compute, storage, and networking architectures that are validated by Lambda HPC engineers and detailed in our whitepaper.
Shipped to you, ready to roll

Echelon clusters can be shipped to you fully assembled, ready to roll out of their rack crates and onto the floor of your data center. All you need to do is plug them in.

Compute

A cluster of Lambda GPU servers

Designed for your team’s use case

Lambda is a team of HPC experts and published AI researchers. Whether you’re looking to set up a traditional HPC cluster, or a cluster for distributed training of language models, we’ll engineer a cluster for your precise use case.
Endlessly customizable

The Echelon cluster design process begins with a Lambda Hyperplane or Lambda Scalar server configuration. This becomes the core compute node used throughout the cluster.

Storage

All-Flash Network Storage with Management Dashboard

Access your data sets and checkpoints at the speed of flash

Storage clusters can often become the bottleneck in large scale deployments. The Echelon reference design combines a high speed storage fabric with local NVMe flash caches to dramatically speed up data transfer rates during training.
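
To illustrate the idea of a local NVMe cache in front of network storage, here is a minimal read-through cache sketch. It is not Lambda's implementation: the function name `cached_path` and the mount points in the usage note are hypothetical, and the sketch does no staleness checking or eviction.

```python
import shutil
from pathlib import Path


def cached_path(remote: Path, remote_root: Path, cache_root: Path) -> Path:
    """Return a local copy of `remote`, copying it into the cache on first use.

    remote_root is the (assumed) network storage mount; cache_root is the
    (assumed) local NVMe mount. Later calls for the same file skip the copy.
    """
    local = cache_root / remote.relative_to(remote_root)
    if not local.exists():
        local.parent.mkdir(parents=True, exist_ok=True)
        # Copy to a temporary name first so a partial copy is never
        # mistaken for a complete cached file.
        tmp = local.with_name(local.name + ".part")
        shutil.copyfile(remote, tmp)
        tmp.replace(local)
    return local
```

A data loader could call `cached_path(Path("/mnt/nfs/data/shard-0.bin"), Path("/mnt/nfs"), Path("/mnt/nvme-cache"))` before opening each file, so only the first epoch reads over the network. The paths here are placeholders.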
Support for dozens of vendors

Lambda has pre-existing OEM relationships with practically every storage appliance provider in the world.
Manage your storage cluster with an easy-to-use web dashboard

Manage your cluster’s storage via the Lambda storage management dashboard. You can easily create and destroy network attached storage volumes, spin up virtual machines on your storage devices, and manage access control.

Networking

Blazing fast HPC networking

100% port-to-port bandwidth spine & leaf topology

Echelon compute nodes communicate via a 200 Gb/s HDR InfiniBand fabric. Each node has eight 200 Gb/s HDR InfiniBand HCAs, for a theoretical peak node-to-node bandwidth of 1.6 Tb/s, or about 200 gigabytes per second in each direction.
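
The per-node figure follows from the link count and the link rate. A quick check, counting one direction only and ignoring protocol overhead (the function name is illustrative):

```python
# Figures from the text: eight HDR InfiniBand HCAs per node, 200 Gb/s each.
HCAS_PER_NODE = 8
LINK_RATE_GBPS = 200  # gigabits per second per HCA


def node_bandwidth(hcas: int = HCAS_PER_NODE,
                   link_gbps: float = LINK_RATE_GBPS) -> tuple[float, float]:
    """Return (aggregate Gb/s, aggregate GB/s) for one node, one direction."""
    total_gbps = hcas * link_gbps
    total_gbytes = total_gbps / 8  # 8 bits per byte
    return total_gbps, total_gbytes


gbps, gbytes = node_bandwidth()
print(f"{gbps:.0f} Gb/s = {gbytes:.0f} GB/s per direction")
# prints "1600 Gb/s = 200 GB/s per direction"
```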

Support

World class support

Get phone support directly from an AI infrastructure engineer

Lambda’s team of seasoned HPC experts has successfully deployed thousands of nodes. When you need help with your cluster, we’ll be there for you.
Support that covers your entire cluster

Lambda Echelon support doesn’t just stop at the hardware. Our Premium and Max support tiers provide end-to-end cluster support. Whether you have a hardware, software, or Linux system administration question, we’ll be there to help.

AI Economics

A fraction of the cost of cloud

Industry Defining TCO

If you're a heavy GPU cloud compute customer, say goodbye to your monthly AWS bill. With Echelon, you can expect a TCO of anywhere from one half to one fifth of what you’re paying on AWS.
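
As a rough illustration of that range, the sketch below applies the one-fifth and one-half factors to a cloud spend figure. The function name and the dollar amount are hypothetical, and actual TCO depends on power, hosting, staffing, and utilization.

```python
def echelon_tco_range(cloud_cost: float) -> tuple[float, float]:
    """Return (low, high) TCO estimates: one fifth and one half of cloud cost."""
    return cloud_cost / 5, cloud_cost / 2


# Hypothetical example: $1,000,000 of cloud spend over some period.
low, high = echelon_tco_range(1_000_000)
print(f"Estimated TCO: ${low:,.0f} to ${high:,.0f}")
# prints "Estimated TCO: $200,000 to $500,000"
```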

Deep learning infrastructure for your data center

Multi-node distributed training

Echelon comes with an optimized network topology. Whether you need a single switch, or a three-tier non-blocking fat tree, our network engineers have it covered.
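
To give a feel for when a single switch is enough and when more tiers are needed, here is a back-of-the-envelope sizing sketch for a non-blocking (1:1) fabric built from identical switches. It is not Lambda's design tool: the function name, the 40-port radix, and the 20-node example are assumptions, and a real design also has to balance uplinks evenly across spines.

```python
import math


def size_fabric(endpoints: int, radix: int) -> dict:
    """Size a non-blocking fabric of identical switches with `radix` ports.

    `endpoints` is the number of node-facing ports (HCAs) to connect.
    """
    if endpoints <= 0 or radix < 2 or radix % 2:
        raise ValueError("need endpoints > 0 and an even radix >= 2")
    if endpoints <= radix:
        # Everything fits on one switch.
        return {"tiers": 1, "leaves": 1, "spines": 0}
    down = radix // 2  # half of each leaf's ports face nodes, half face spines
    if endpoints > down * radix:
        # Beyond two-tier capacity; a third tier is needed (not sized here).
        return {"tiers": 3, "leaves": None, "spines": None}
    leaves = math.ceil(endpoints / down)
    spines = math.ceil(leaves * down / radix)
    return {"tiers": 2, "leaves": leaves, "spines": spines}


# Hypothetical example: 20 nodes with 8 HCAs each on 40-port switches.
print(size_fabric(20 * 8, 40))
# prints "{'tiers': 2, 'leaves': 8, 'spines': 4}"
```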
A one stop data center shop

Each cluster comes with all of the components racked up, plugged in, and properly labeled. It's shipped to you in a secured rack crate with an integrated ramp for easy installation into your data center or co-location facility.
Engineered for you

Leverage Lambda engineering to design an Echelon cluster that's tailored to your specific deep learning workload.

Read the Echelon Whitepaper
