The Lambda Deep Learning Blog

Recent Posts

Published 06/26/2022 by Stephen Balaban

Lambda's Machine Learning Infrastructure Playbook and Best Practices

If you're trying to figure out how to build and scale your team's deep learning infrastructure, this presentation is for you. We walk you through the decisions associated with building cloud, on-prem, and hybrid infrastructure for your team. We've distilled best practices learned from helping thousands of teams build their

Published 02/23/2022 by Stephen Balaban

Deep learning is the future of gaming.

Deep learning is the most important technology to impact gaming since the advent of 3D graphics. This short video presentation walks you through a few of the technologies that will deliver unbelievable gaming experiences in the near future. Research covered in this presentation: 1. Photorealistic neural rendering 2. Deepfakes for

Published 01/04/2022 by Stephen Balaban

Lambda's Deep Learning Curriculum

This curriculum provides an overview of free online resources for learning about deep learning. It includes courses, books, and even important people to follow. If you only want to do one thing, do this: Train an MNIST network with PyTorch. https://github.com/pytorch/examples/tree/master/mnist Introductory CS231n:

Published 11/01/2021 by Stephen Balaban

NVIDIA NGC Tutorial: Run a PyTorch Docker Container using nvidia-container-toolkit on Ubuntu

Full Video Tutorial. This tutorial shows you how to install Docker with GPU support on Ubuntu Linux. To get GPU passthrough to work, you'll need Docker, nvidia-container-toolkit, Lambda Stack, and a Docker image with a GPU-accelerated library. 1) Install Lambda Stack LAMBDA_REPO=$(mktemp) && \ wget -O${LAMBDA_REPO} https://lambdalabs.

Published 07/19/2021 by Stephen Balaban

Lambda raises $24.5M to build GPU cloud and deep learning hardware

Lambda secured $24.5M in financing, including a $15M Series A equity round and a $9.5M debt facility that will allow for the growth of Lambda GPU Cloud and the expansion of Lambda's on-prem AI infrastructure software products. Read more details in the post.

Published 07/16/2021 by Stephen Balaban

Lambda Echelon – a turn key GPU cluster for your ML team

Introducing the Lambda Echelon. Lambda Echelon [https://lambdalabs.com/gpu-cluster/echelon] is a GPU cluster designed for AI. It comes with the compute, storage, network, power, and support you need to tackle large-scale deep learning tasks. Echelon offers a turn-key solution to faster training, faster hyperparameter search, and faster inference.

Published 10/06/2020 by Stephen Balaban

NVIDIA A100 GPU Benchmarks for Deep Learning

Benchmarks for ResNet-152, Inception v3, Inception v4, VGG-16, AlexNet, SSD300, and ResNet-50 using the NVIDIA A100 GPU and DGX A100 server.

Published 05/22/2020 by Stephen Balaban

Hyperplane-16 InfiniBand Cluster Total Cost of Ownership Analysis

In this post we'll walk through using our Total Cost of Ownership (TCO) calculator to examine the cost of a variety of Lambda Hyperplane-16 clusters. We have the option to include 100 Gb/s EDR InfiniBand networking, storage servers, and complete rack-stack-label-cable service. The purpose of this post is to

Published 04/07/2020 by Stephen Balaban

Setting up a Mellanox InfiniBand Switch (SB7800 36-port EDR)

This tutorial will walk you through the steps required to set up a Mellanox SB7800 36-port switch. The subnet manager discovers and configures the devices running on the InfiniBand fabric. This tutorial will show you how to set it up via the command line or via the web browser.

Published 10/30/2019 by Stephen Balaban

A Gentle Introduction to Multi GPU and Multi Node Distributed Training

This presentation is a high-level overview of the different types of training regimes that you'll encounter as you move from single GPU to multi GPU to multi node distributed training. It briefly describes where the computation happens, how the gradients are communicated, and how the models are updated and communicated.

Published 05/31/2019 by Stephen Balaban

RTX 2080 Ti Deep Learning Benchmarks with TensorFlow

RTX 2080 Ti vs. RTX 2080 vs. Titan RTX vs. Tesla V100 vs. Titan V vs. GTX 1080 Ti vs. Titan Xp benchmarks for neural net training.

Published 03/04/2019 by Stephen Balaban

Perform GPU, CPU, and I/O stress testing on Linux

CPU, GPU, and I/O utilization monitoring using tmux, htop, iotop, and nvidia-smi. This stress test is running on a Lambda GPU Cloud [https://lambdalabs.com/service/gpu-cloud] 4x GPU instance. Oftentimes you'll want to put a system through its paces after it's been set up. To stress test

Published 02/17/2019 by Stephen Balaban
