[[ build.model.nick ]]

Choose the price, not the parts. Each model is built with the GPUs, CPU, RAM, and storage that maximize Deep Learning performance per dollar.

[[ build.model.image_alt ]]

[[ build.model.nick ]]

Basic

OS Ubuntu 18.04 + Lambda Stack
GPUs 8x NVIDIA GTX 1080 Ti
CPU 2x Intel Xeon E5-2650 v4
Memory 128 GB
STORAGE 2 TB SATA SSD
EXTRA 4 TB HDD
NETWORK 10 Gbps ethernet

Premium

OS Ubuntu 18.04 + Lambda Stack
GPUs 8x NVIDIA RTX 2080 Ti
CPU 2x Intel Xeon E5-2650 v4
Memory 256 GB
STORAGE 4 TB SATA SSD
EXTRA 4 TB HDD
NETWORK 10 Gbps ethernet

Max

OS Ubuntu 18.04 + Lambda Stack
GPUs 8x NVIDIA Titan RTX
CPU 2x Intel Xeon E5-2650 v4
Memory 512 GB
STORAGE 2 TB NVMe SSD
EXTRA 8 TB RAID 5 (3x 4 TB SSD)
NETWORK 10 Gbps ethernet

Customize

Not seeing what you want?

Add a Protection Plan

3-year protection for [[ getWarrantyPrice(true) ]]

[[ getSubtotal(true)]]
Talk to an engineer
(650) 479-5530

About the [[ build.model.title ]] Basic

GPUs

GPUs are the most critical piece of hardware for Deep Learning. The Basic has 8x NVIDIA GTX 1080 Ti GPUs (Pascal Architecture). Each 1080 Ti delivers 11.3 TFLOPS of FP32 performance (the standard precision for Deep Learning training) and has 11 GB of memory. For most Machine Learning tasks, the 1080 Ti is about 75% as fast as the RTX 2080 Ti; we estimate it will be approximately 70% as fast as the Titan RTX.

Processor

During training, the CPUs preprocess data and feed it to the GPUs. Slow processors will cause the GPUs to waste cycles waiting for this data. Core count and PCIe lane count are important CPU performance factors. More cores means faster data preprocessing; more PCIe lanes means faster transmission of that data to the GPUs. The Basic has two Intel Xeon E5-2650 v4 CPUs (12 cores and 40 PCIe lanes each). Its core-to-GPU ratio is 3, which follows the best practice of at least 1 CPU core per GPU. The Basic's CPUs, combined with its PLX-enabled motherboard, provide 16x PCIe lanes to each GPU (the max possible).
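
To make the core-per-GPU rule concrete, here is a minimal PyTorch sketch that budgets DataLoader worker processes against the machine's core count. The dataset, batch size, and variable names are illustrative assumptions, not part of Lambda Stack:

```python
# Budgeting data-loading workers against the CPU core count (illustrative).
import os
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1024, 3, 224, 224))  # stand-in dataset

num_gpus = max(torch.cuda.device_count(), 1)
num_cores = os.cpu_count() or 1  # logical cores (2x physical with Hyper-Threading)
workers_per_gpu = max(num_cores // num_gpus, 1)

# The Basic's 24 physical cores across 8 GPUs give the 3:1 ratio described above.
loader = DataLoader(
    dataset,
    batch_size=64,
    num_workers=workers_per_gpu,
    pin_memory=True,  # pinned host memory speeds up copies to the GPU
)
```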

Motherboard

A motherboard's PCIe topology significantly impacts Deep Learning performance. PCIe lanes are data pipes that enable communication amongst the GPUs and CPU. The number of PCIe lanes attached to a given device can range from 1 to 16. More lanes is better: for example, a device with 16 PCIe lanes can send data faster than a device with 4. When training a neural net, the GPUs and CPU send huge amounts of data to each other. To ensure speedy communication, the Basic's motherboard provides each GPU with 16x PCIe lanes, which is the highest of any motherboard as of 2018.
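
To see what those lanes are worth in practice, you can time a host-to-GPU copy with PyTorch. This is a rough sketch, not a rigorous benchmark, and it assumes CUDA is available:

```python
# Rough host-to-GPU transfer rate over PCIe (ballpark only).
import time
import torch

assert torch.cuda.is_available()
x = torch.empty(256, 1024, 1024, pin_memory=True)  # 1 GiB of FP32 data

torch.cuda.synchronize()
start = time.perf_counter()
x.to("cuda")
torch.cuda.synchronize()
elapsed = time.perf_counter() - start

gib = x.numel() * x.element_size() / 2**30
print(f"{gib / elapsed:.1f} GiB/s host -> device")
# 16 lanes of PCIe 3.0 top out near 16 GB/s theoretical; 4 lanes near 4 GB/s.
```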

Memory

A server designed for A.I. workloads should have at least as much RAM as GPU memory. According to this rule, a machine like the Basic, which has 8x GTX 1080 Ti GPUs, should have at least 88 GB of memory (8 GPUs * 11 GB memory per GPU). The Basic has 128 GB of memory, which provides plenty of overhead. If you work with very large data (e.g. radiological images) or train with large batch sizes, consider the Premium or Max servers, which have more memory.
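
The rule of thumb is simple arithmetic; this hypothetical helper (not part of any Lambda tooling) spells it out:

```python
# RAM rule of thumb: system memory should be >= total GPU memory.
def min_system_ram_gb(num_gpus: int, gpu_memory_gb: float) -> float:
    return num_gpus * gpu_memory_gb

print(min_system_ram_gb(8, 11))  # 88 GB minimum; the Basic ships with 128 GB
```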

Storage

Most datasets do not fit in RAM. In such cases, during model training, subsets must be repeatedly swapped in and out of RAM from nonvolatile storage. Such a pipeline requires fast, solid state storage; without it, the GPUs would waste cycles waiting for their next batch of data. The Basic was designed with this constraint in mind; it has two nonvolatile storage devices: a 2 TB solid state drive (fast) for data you're training on now, and a 4 TB hard disk drive (slower) for everything else. Files located in the /data directory are stored on the HDD; all other files are stored on the SSD.
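
For example, you can confirm how much room each tier has left with a few lines of Python; the /data path follows the convention described above:

```python
# Free space on the HDD-backed /data tier vs. the SSD-backed root filesystem.
import shutil

for path in ("/data", "/"):
    usage = shutil.disk_usage(path)
    print(f"{path}: {usage.free / 1e12:.2f} TB free of {usage.total / 1e12:.2f} TB")
```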

Network

The Basic has 10 Gbps ethernet. Your ISP will almost certainly be the bottleneck. The main benefit of 10 Gbps ethernet (as opposed to the standard 1 Gbps) is fast file transfers between the computers on your network. Multi-node distributed training requires at least 40 Gbps (Infiniband territory).
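
As a back-of-the-envelope illustration, assuming a hypothetical 100 GB dataset and ignoring protocol overhead:

```python
# Time to move a 100 GB dataset at 1 Gbps vs. 10 Gbps (rough numbers).
dataset_gb = 100

for gbps in (1, 10):
    seconds = dataset_gb * 8 / gbps  # 8 bits per byte
    print(f"{gbps:>2} Gbps: ~{seconds / 60:.1f} min")
# ~13.3 min at 1 Gbps vs. ~1.3 min at 10 Gbps
```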

Who bought a Basic?

About the [[ build.model.title ]] Premium

GPUs

GPUs are the most critical piece of hardware for Deep Learning. The Premium has 8x NVIDIA RTX 2080 Ti GPUs (Turing Architecture). Each RTX 2080 Ti delivers 13.4 TFLOPS of FP32 performance (the standard precision for Deep Learning training) and has 11 GB of memory. Our benchmarks show that the 2080 Ti is approximately 30% faster than the previous generation GTX 1080 Ti; we estimate that it will be 85% as fast as the Titan RTX. If you need lots of GPU memory (e.g. because you train with large batch sizes), consider the Lambda Blade Max, which uses Titan RTX GPUs with 24 GB of memory each.
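
If you're not sure how much memory your current GPUs give you, PyTorch can report it directly; this quick check is generic, nothing Lambda-specific:

```python
# Report each visible GPU's name and total memory.
import torch

for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 2**30:.0f} GiB")
# An RTX 2080 Ti reports ~11 GiB; a Titan RTX reports ~24 GiB.
```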

Processor

During training, the CPUs preprocess data and feed it to the GPUs. Slow processors will cause the GPUs to waste cycles waiting for this data. Core count and PCIe lane count are important CPU performance factors. More cores means faster data preprocessing; more PCIe lanes means faster transmission of that data to the GPUs. The Premium has two Intel Xeon E5-2650 v4 CPUs (12 cores and 40 PCIe lanes each). Its core-to-GPU ratio is 3, which follows the best practice of at least 1 CPU core per GPU. The Premium's CPUs, combined with its PLX-enabled motherboard, provide 16x PCIe lanes to each GPU (the max possible).

Motherboard

A motherboard's PCIe topology significantly impacts Deep Learning performance. PCIe lanes are data pipes that enable communication amongst the GPUs and CPU. The number of PCIe lanes attached to a given device can range from 1 to 16. More lanes is better: for example, a device with 16 PCIe lanes can send data faster than a device with 4. When training a neural net, the GPUs and CPU send huge amounts of data to each other. To ensure speedy communication, the Premium's motherboard provides each GPU with 16x PCIe lanes, which is the highest of any motherboard as of 2018.

Memory

A server for Machine Learning should have at least as much RAM as GPU memory. According to this rule, a machine like the Premium, which has eight RTX 2080 Ti GPUs, should have at least 88 GB of memory (8 GPUs * 11 GB memory per GPU). The Premium has 256 GB of memory, which provides plenty of overhead. If you work with extremely large data (e.g. radiological images) or train with very large batch sizes, consider the Max server, which has more memory than the Premium.

Storage

Most datasets do not fit in RAM. In such cases, during model training, subsets must be repeatedly swapped in and out of RAM from nonvolatile storage. Such a pipeline requires fast, solid state storage; without it, the GPUs would waste cycles waiting for their next batch of data. The Premium was designed with this constraint in mind; it has two nonvolatile storage devices: a 4 TB solid state drive (fast) for data you're training on now, and a 4 TB hard disk drive (slower) for everything else. Files located in the /data directory are stored on the HDD; all other files are stored on the SSD.

Network

The Premium has 10 Gbps ethernet. Your ISP will almost certainly be the bottleneck. The main benefit of 10 Gbps ethernet (as opposed to the standard 1 Gbps) is fast file transfers between the computers on your network. Multi-node distributed training requires at least 40 Gbps (Infiniband territory).

Who bought a Premium?

About the [[ build.model.title ]] Max

GPUs

GPUs are the most critical piece of hardware for Deep Learning. The Max has 8x NVIDIA Titan RTX GPUs (Turing Architecture). We estimate that the Titan RTX will be 10-15% faster than the RTX 2080 Ti and 35-45% faster than the GTX 1080 Ti for most AI and Machine Learning tasks. The Titan RTX has 24 GB of memory: more than twice that of the GTX 1080 Ti and RTX 2080 Ti. If you train with large batch sizes or large inputs (e.g. high-resolution images) and need lots of GPU memory, the Titan RTX is the best-priced GPU on the market.

Processor

During training, the CPUs preprocess data and feed it to the GPUs. Slow processors will cause the GPUs to waste cycles waiting for this data. Core count and PCIe lane count are important CPU performance factors. More cores means faster data preprocessing; more PCIe lanes means faster transmission of that data to the GPUs. The Max has two Intel Xeon E5-2650 v4 CPUs (12 cores and 40 PCIe lanes each). Its core-to-GPU ratio is 3, which follows the best practice of at least 1 CPU core per GPU. The Max's CPUs, combined with its PLX-enabled motherboard, provide 16x PCIe lanes to each GPU (the max possible).

Motherboard

A motherboard's PCIe topology significantly impacts Deep Learning performance. PCIe lanes are data pipes that enable communication amongst the GPUs and CPU. The number of PCIe lanes attached to a given device can range from 1 to 16. A device with 16 PCIe lanes can send data faster than a device with 4. When training a neural net, the GPUs and CPU send huge amounts of data to each other. To ensure speedy communication, the Max's motherboard provides each GPU with 16x PCIe lanes, which is the highest of any motherboard as of 2018.

Memory

A Machine Learning server should have at least as much RAM as GPU memory. According to this rule, a machine like the Max, which has eight Titan RTX GPUs, should have at least 192 GB of memory (8 GPUs * 24 GB memory per GPU). The Max has 512 GB of memory, which provides plenty of overhead. If you work with very large data (e.g. radiological images) or train with very large batch sizes, 512 GB of memory is the standard configuration.

Storage

Most datasets do not fit in RAM. In such cases, during model training, subsets must be repeatedly swapped in and out of RAM from nonvolatile storage. Such a pipeline requires fast, solid state storage; without it, the GPUs would waste cycles waiting for their next batch of data. The Max was designed with this constraint in mind; it has two nonvolatile storage devices: a 2 TB NVMe SSD (extremely fast) for data you're training on now, and an 8 TB RAID 5 array (3x 4 TB SATA SSDs) for everything else. Files located in the /data directory are stored on the RAID array; all other files are stored on the NVMe SSD.
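
The RAID 5 capacity follows from simple arithmetic: one drive's worth of space goes to parity. A hypothetical helper makes the math explicit:

```python
# RAID 5 usable capacity: (n - 1) drives' worth; one drive's worth holds parity.
def raid5_usable_tb(num_drives: int, drive_tb: float) -> float:
    return (num_drives - 1) * drive_tb

print(raid5_usable_tb(3, 4))  # 8 TB usable from 3x 4 TB SSDs, as in the Max
```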

Network

The Max has 10 Gbps ethernet. Your ISP will almost certainly be the bottleneck. The main benefit of 10 Gbps ethernet (as opposed to the standard 1 Gbps) is fast file transfers between the computers on your network. Multi-node distributed training requires at least 40 Gbps (Infiniband territory).

Who bought a Max?

[[ component.name ]]

[[ option.description ]] [[ build.getPriceDiff(component, option)]]