Lambda just launched its RTX 3090, RTX 3080, and RTX 3070 deep learning workstations. If you're thinking of building your own 30XX workstation, read on. In this post, we discuss the size, power, cooling, and performance of these new GPUs. But first, we'll answer the most common question:
How many GPUs can you fit in your workstation? Your workstation should not exceed:

* 2x RTX 3090s
* 2x RTX 3080s
* 4x RTX 3070s

Note that PCIe extenders introduce structural problems and shouldn't be used if you plan on moving (especially shipping) the workstation.
The RTX 3090’s dimensions are quite unorthodox: it occupies 3 PCIe slots and its length will prevent it from fitting into many PC cases. The RTX 3070 and RTX 3080 are of standard size, similar to the RTX 2080 Ti.
* OEMs like PNY, ASUS, GIGABYTE, and EVGA will release their own 30XX series GPU models. Several upcoming RTX 3080 and RTX 3070 models will occupy 2.7 PCIe slots.
The 3000 series GPUs consume far more power than previous generations:

* RTX 3090: 350W
* RTX 3080: 320W
* RTX 3070: 220W
For reference, the RTX 2080 Ti consumes 250W.
Your workstation's power draw must not exceed the capacity of its PSU or the circuit it’s plugged into.
Circuit limitations
A standard 15A, 120V circuit can supply at most 1800W, and continuous loads shouldn't exceed 80% of that: 1440W. Combined with the power figures above, this effectively caps a single-circuit workstation at 1440W.
As an example, let’s see why a workstation with four RTX 3090s and a high end processor is impractical:
The GPUs alone draw 4 × 350W = 1400W; add roughly 280W for a high-end CPU and 80W for the motherboard, and the build consumes about 1760W, far beyond the 1440W circuit limit.
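This kind of budget is easy to sanity-check with a few lines of code. Here's a minimal sketch; the CPU and motherboard wattages are assumptions for illustration, so substitute your own parts' TDP/TGP figures:

```python
# Back-of-the-envelope power budget check. Component wattages below are
# assumptions for illustration; substitute your own parts' figures.
CIRCUIT_LIMIT_W = 1440  # 80% of a standard 15A, 120V circuit (1800W peak)

build_w = {
    "4x RTX 3090": 4 * 350,  # 350W apiece
    "CPU": 280,              # assumed high-end (Threadripper-class) processor
    "motherboard": 80,       # rough estimate
}

total_w = sum(build_w.values())
print(f"Total draw: {total_w}W vs. circuit limit {CIRCUIT_LIMIT_W}W")
if total_w > CIRCUIT_LIMIT_W:
    print("Over budget: this build would overload a standard circuit.")
```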
PSU limitations
The highest-rated workstation PSUs on the market offer at most 1600W at standard home/office voltages; PSUs beyond this capacity are impractical because they would overload many circuits. Even if your home or office has higher-amperage circuits, we recommend against workstations exceeding 1440W: a PSU may carry a 1600W rating, but Lambda sees higher rates of PSU failure as workstation power consumption approaches 1500W.
Are there ways around the 1440W limit? Yes, though we don't recommend them.
Warning: Consult an electrician before modifying your home or office’s electrical setup.
Lambda's cooling recommendations for 1x, 2x, 3x, and 4x GPU workstations depend on the card; the RTX 3090, RTX 3080, and RTX 3070 each have different blower-style air cooling and liquid cooling options, discussed below.
Blower cards pull air from inside the chassis and exhaust it out the rear of the case; this contrasts with standard (open-air) cards, which expel hot air back into the case.
Blower cards are currently facing thermal challenges due to the 3000 series' high power consumption. We fully expect RTX 3070 blower cards, but we're less certain about the RTX 3080 and RTX 3090.
Liquid cooling reduces noise and heat levels, but it is currently unclear whether it is worth the increased cost, complexity, and failure rates. We will be testing liquid cooling in the coming months and will update this section accordingly.
When a GPU's temperature exceeds a predefined threshold, it will automatically downclock (throttle) to prevent heat damage. Downclocking manifests as a slowdown of your training throughput. With multi-GPU setups, if cooling isn't properly managed, throttling is a real possibility. Lambda has designed its workstations to avoid throttling, but if you're building your own, it may take quite a bit of trial-and-error before you get the performance you want.
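You can check for throttling directly from Python while a training job is running. Below is a minimal sketch using the pynvml bindings to NVIDIA's NVML library (`pip install pynvml`); the throttle-reason constants assume a reasonably recent pynvml release:

```python
# Spot-check GPU temperature, SM clock, and thermal throttle flags via NVML.
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    if isinstance(name, bytes):  # older pynvml versions return bytes
        name = name.decode()
    temp_c = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
    sm_mhz = pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_SM)
    reasons = pynvml.nvmlDeviceGetCurrentClocksThrottleReasons(handle)
    thermal = bool(reasons & (pynvml.nvmlClocksThrottleReasonSwThermalSlowdown
                              | pynvml.nvmlClocksThrottleReasonHwThermalSlowdown))
    print(f"GPU {i} ({name}): {temp_c}C, SM clock {sm_mhz}MHz, "
          f"thermal throttling: {thermal}")
pynvml.nvmlShutdown()
```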
The new RTX 3000 series brings a number of improvements that we expect will deliver an extremely impressive jump in performance, one that should be even more pronounced on a FLOPS-per-dollar basis.
We don’t have 3rd party benchmarks yet (we’ll update this post when we do). However, we do expect to see quite a leap in performance for the RTX 3090 vs. the RTX 2080 Ti, since it has more than double the number of CUDA cores at just over 10,000! We also expect very nice bumps in performance for the RTX 3080, and even the RTX 3070, over the 2080 Ti.
On the surface, the RTX 3000 GPUs should be extremely cost effective. Even at $1,499 for the Founders Edition, the RTX 3090 delivers a massive 10,496 CUDA cores and 24GB of VRAM. On the low end, we expect the $499 RTX 3070, with 5,888 CUDA cores and 8GB of VRAM, to deliver deep learning performance comparable to the previous flagship 2080 Ti for many models. We’ll be updating this section with hard numbers as soon as we have the cards in hand.
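As a rough illustration of that value story, here's a quick cores-per-dollar comparison using the list prices and core counts above. CUDA core count is only a crude proxy for real deep learning throughput, so treat the ratios as illustrative:

```python
# CUDA cores per dollar at Founders Edition list prices. Core count is a
# crude proxy for FLOPS; real throughput depends on much more than this.
cards = {
    "RTX 3090": {"cuda_cores": 10496, "price_usd": 1499},
    "RTX 3070": {"cuda_cores": 5888, "price_usd": 499},
}
for name, spec in cards.items():
    ratio = spec["cuda_cores"] / spec["price_usd"]
    print(f"{name}: {ratio:.1f} CUDA cores per dollar")
# RTX 3090: 7.0 CUDA cores per dollar
# RTX 3070: 11.8 CUDA cores per dollar
```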
During parallelized deep learning training jobs, inter-GPU and GPU-to-CPU bandwidth can become a major bottleneck. PCIe 4.0 doubles the theoretical bidirectional throughput of PCIe 3.0, from 32 GB/s to 64 GB/s. In practice, in tests with other PCIe 4.0 cards, we see roughly a 54.2% increase in observed GPU-to-GPU throughput and a 60.7% increase in CPU-to-GPU throughput.
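If you want to see what your own system achieves, a rough host-to-device copy benchmark takes only a few lines of PyTorch. This is a sketch, not a rigorous tool (NVIDIA's bandwidthTest CUDA sample is the standard utility), and it assumes a CUDA-capable GPU is present:

```python
# Rough host-to-device bandwidth measurement with PyTorch.
import torch

assert torch.cuda.is_available()

N_BYTES = 1 << 30  # 1 GiB payload
host = torch.empty(N_BYTES, dtype=torch.uint8).pin_memory()  # pinned for async DMA
device = torch.empty(N_BYTES, dtype=torch.uint8, device="cuda")

# Warm up before timing.
for _ in range(3):
    device.copy_(host, non_blocking=True)
torch.cuda.synchronize()

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
reps = 10
start.record()
for _ in range(reps):
    device.copy_(host, non_blocking=True)
end.record()
torch.cuda.synchronize()

seconds = start.elapsed_time(end) / 1e3  # elapsed_time returns milliseconds
gb_moved = reps * N_BYTES / 1e9
print(f"Host-to-device: {gb_moved / seconds:.1f} GB/s")
```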
The RTX 3090 is the only one of the new GPUs to support NVLink. While we don’t have the exact specs yet, if it supports the same number of NVLink connections as the recently announced A100 PCIe GPU, you can expect to see 600 GB/s of bidirectional bandwidth between a pair of 3090s, vs. 64 GB/s for PCIe 4.0.
However, it’s important to note that while the two cards will have an extremely fast connection between them, NVLink does not merge them into a single “super GPU.” You will still have to write your models to support multiple GPUs, as in the sketch below.
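For example, here's a minimal PyTorch sketch using nn.DataParallel, which splits each batch across all visible GPUs (for serious multi-GPU training, torch.nn.parallel.DistributedDataParallel is generally preferred); the model and tensor shapes are arbitrary placeholders:

```python
# Data parallelism must be explicit in your code; NVLink alone doesn't
# distribute work across GPUs.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10))

if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # scatters each batch across all visible GPUs
model = model.cuda()

x = torch.randn(256, 1024).cuda()  # batch is split across GPUs, outputs gathered
y = model(x)
print(y.shape)  # torch.Size([256, 10])
```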
It’s important to take available space, power, cooling, and relative performance into account when deciding which cards to include in your next deep learning workstation. The biggest issues you will face when building it will be fitting the cards physically, supplying them with enough power, and keeping them cool.
It’s definitely possible to build one of these workstations yourself. But if you’d like to avoid the hassle, and have it arrive preinstalled with the drivers and frameworks you need to get started, we have verified and tested workstations with up to 2x RTX 3090s, 2x RTX 3080s, or 4x RTX 3070s.