This presentation is a high-level overview of the training regimes you'll encounter as you move from single-GPU to multi-GPU to multi-node distributed training. It describes where the computation happens, how the gradients are communicated, and how the models are updated and communicated.
The Lambda Deep Learning Blog