How to use FlashAttention-2 on Lambda Cloud, including H100 vs. A100 benchmark results for training GPT-3-style models with the new release.
The Lambda Deep Learning Blog
Lambda Launches New Hyperplane Server with NVIDIA H100 GPUs and AMD EPYC 9004 series CPUs
September 07, 2023
Lambda Selected as 2023 Americas NVIDIA Partner Network Solution Integration Partner of the Year
April 04, 2023
Learn how to use mpirun to launch a LLaMA inference job across multiple cloud instances if you do not have a multi-GPU workstation or server.
Lambda and Hugging Face are collaborating on a 2-week sprint to fine-tune OpenAI's Whisper model in as many languages as possible.
RTX 4090 vs RTX 3090 benchmarks to assess deep learning training performance, including training throughput/$, throughput/watt, and multi-GPU scaling.
This article discusses the performance and scalability of H100 GPUs and the reasons to upgrade your ML infrastructure following NVIDIA's H100 release.
This tutorial summarizes how to write and launch PyTorch distributed data parallel jobs across multiple nodes, with working examples using the torch.distributed.launch, torchrun, and mpirun APIs.
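As a minimal sketch of the pattern that tutorial covers: a DistributedDataParallel (DDP) script reads the rank and world size that `torchrun` injects into the environment, initializes the process group, and wraps the model so gradients are all-reduced across ranks. The defaults below (single process, CPU `gloo` backend, toy linear model) are illustrative assumptions so the snippet runs standalone; on real GPU nodes you would use the `nccl` backend and the environment set by the launcher.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# torchrun sets RANK, WORLD_SIZE, MASTER_ADDR, and MASTER_PORT for each worker;
# the defaults below let this sketch run as a single CPU process.
os.environ.setdefault("RANK", "0")
os.environ.setdefault("WORLD_SIZE", "1")
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

dist.init_process_group(backend="gloo")  # use "nccl" on GPU nodes

model = torch.nn.Linear(16, 4)  # hypothetical toy model
ddp_model = DDP(model)          # hooks gradient all-reduce into backward()

x = torch.randn(8, 16)
loss = ddp_model(x).sum()
loss.backward()                 # gradients are synchronized across ranks here

dist.destroy_process_group()
```

Launched as `torchrun --nnodes=2 --nproc_per_node=8 train.py`, the same script runs one copy per GPU with no code changes, which is the main appeal of the DDP + torchrun combination.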
This blog describes how to set up a Run:AI cluster on Lambda Cloud with one or multiple cloud instances.
While waiting for NVIDIA's next-generation consumer and professional GPUs, we decided to write a blog about the best GPU for Deep Learning currently available, as of March 2022.
NVIDIA® A40 GPUs are now available on Lambda Scalar servers. In this post, we benchmark the A40 with 48 GB of GDDR6 VRAM to assess its training performance using PyTorch and TensorFlow. We then compare it against the NVIDIA V100, RTX 8000, RTX 6000, and RTX 5000.
This post discusses the Total Cost of Ownership (TCO) for a variety of Lambda A100 servers and clusters. We calculate the TCO for individual Hyperplane-A100 servers, compare the cost with renting an AWS p4d.24xlarge instance, and walk through the cost of building and operating A100 clusters.
PyTorch benchmarks of the RTX A6000 and RTX 3090 for convnets and language models - both 32-bit and mixed-precision performance.
Chuan Li, PhD reviews GPT-3, the new NLP model from OpenAI. The technical overview covers how GPT-3 was trained, GPT-2 vs. GPT-3, and GPT-3 performance.
Scaling out deep learning infrastructure becomes easier with 16 NVIDIA Tesla V100 GPUs and preinstalled frameworks like TensorFlow, Keras, and PyTorch.