This presentation is a high-level overview of the training regimes you'll encounter as you scale from single-GPU to multi-GPU to multi-node distributed training. It describes where the computation happens, how gradients are communicated, and how model parameters are updated and kept in sync across devices.
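The presentation itself is framework-agnostic, but as a concrete reference point, here is a minimal sketch of the most common regime it covers: synchronous data-parallel training, shown here with PyTorch's DistributedDataParallel. Each process owns one GPU and computes gradients on its own data shard; an all-reduce averages the gradients so every replica applies the identical update. The filename, model, and hyperparameters below are illustrative assumptions, not anything specified in the presentation.

```python
# Minimal synchronous data-parallel sketch using PyTorch DistributedDataParallel.
# One process per GPU: each computes gradients on its own shard, DDP all-reduces
# (averages) the gradients during backward(), and every replica applies the same
# optimizer step, keeping model copies in sync.
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = nn.Linear(32, 4).cuda(local_rank)  # stand-in for a real network
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()

    for step in range(10):
        # A real job would give each rank a distinct shard via DistributedSampler;
        # random tensors keep this sketch self-contained.
        x = torch.randn(16, 32, device=local_rank)
        y = torch.randn(16, 4, device=local_rank)

        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()   # gradients are all-reduced across ranks here
        optimizer.step()  # identical parameter update on every replica

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched with, for example, `torchrun --nproc_per_node=8 train_ddp.py` (one process per GPU on a single node); multi-node runs add `--nnodes` and a rendezvous endpoint, with gradient all-reduce then crossing the node interconnect.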