This presentation is a high-level overview of the different types of training regimes that you'll encounter as you move from single GPU to multi GPU to multi node distributed training. It briefly describes where the computation happens, how the gradients are communicated, and how the models are updated and communicated.
The Lambda Deep Learning Blog
Voltron Data Case Study: Why ML teams are using Lambda Reserved Cloud Clusters
November 01, 2022
How to fine tune stable diffusion: how we made the text-to-pokemon model at Lambda
September 28, 2022