BERT is Google's state-of-the-art approach to pre-training language representations. This blog post is about running BERT with multiple GPUs. Specifically, we will use the Horovod framework to parallelize the training. We will list all the changes to the original BERT implementation and highlight a few places that can make or break the performance.
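As a preview, the core Horovod changes follow a standard pattern for TF1-style training scripts like BERT's. The sketch below shows that canonical pattern, not the post's exact diff; the learning rate value is a placeholder:

```python
import tensorflow as tf
import horovod.tensorflow as hvd

# 1. Initialize Horovod (one training process per GPU).
hvd.init()

# 2. Pin each process to a single GPU using its local rank.
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())

# 3. Scale the learning rate by the number of workers and wrap the
#    optimizer so gradients are averaged across GPUs via allreduce.
#    (The base rate 1e-5 is illustrative, not from the original post.)
opt = tf.train.AdamOptimizer(1e-5 * hvd.size())
opt = hvd.DistributedOptimizer(opt)

# 4. Broadcast the initial variables from rank 0 so every worker
#    starts from identical weights.
hooks = [hvd.BroadcastGlobalVariablesHook(0)]
```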