2M: PyTorch Distributed Data Parallel
Code
Note
This directory contains the serial PyTorch code for training a simple classifier on MNIST (mnist_classify.py). The exercise is to add Distributed Data Parallel (DDP) functionality to this code, assuming it will be launched either with the srun parallel launcher or, as a bonus exercise, with the torchrun parallel launcher. The answers are already present in this directory, but you should start from mnist_classify.py and try to modify the code yourself before looking at them.
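One practical difference between the two launchers is how each process learns its rank. The sketch below is not the reference solution in this directory; it is a minimal illustration, assuming that torchrun exports RANK, WORLD_SIZE and LOCAL_RANK and that srun exports SLURM_PROCID, SLURM_NTASKS and SLURM_LOCALID. The names DistEnv, get_dist_env and wrap_model_for_ddp are hypothetical helpers introduced here for illustration only.

```python
import os
from typing import Mapping, NamedTuple


class DistEnv(NamedTuple):
    """Hypothetical container for the values DDP setup needs."""
    rank: int         # global index of this process
    world_size: int   # total number of processes
    local_rank: int   # index of this process on its node (used to pick a GPU)


def get_dist_env(environ: Mapping[str, str] = os.environ) -> DistEnv:
    """Read rank information from torchrun variables, else from Slurm ones."""
    if "RANK" in environ and "WORLD_SIZE" in environ:
        # torchrun sets these for every worker it starts.
        return DistEnv(int(environ["RANK"]),
                       int(environ["WORLD_SIZE"]),
                       int(environ.get("LOCAL_RANK", 0)))
    if "SLURM_PROCID" in environ and "SLURM_NTASKS" in environ:
        # srun sets these for every task in the job step.
        return DistEnv(int(environ["SLURM_PROCID"]),
                       int(environ["SLURM_NTASKS"]),
                       int(environ.get("SLURM_LOCALID", 0)))
    # Neither launcher detected: behave like the serial script.
    return DistEnv(0, 1, 0)


def wrap_model_for_ddp(model, env: DistEnv):
    """Initialise the process group and wrap the model (sketch only).

    Assumes MASTER_ADDR and MASTER_PORT are already set in the environment;
    torchrun sets them itself, while with srun the job script has to.
    """
    # Imported lazily so the environment helper above works without PyTorch.
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    use_cuda = torch.cuda.is_available()
    dist.init_process_group(backend="nccl" if use_cuda else "gloo",
                            rank=env.rank, world_size=env.world_size)
    if use_cuda:
        torch.cuda.set_device(env.local_rank)
        return DDP(model.to(env.local_rank), device_ids=[env.local_rank])
    return DDP(model)
```

In a DDP version of the training script, the dataset would typically also be sharded across processes, for example with torch.utils.data.distributed.DistributedSampler(dataset, num_replicas=env.world_size, rank=env.rank), so that each rank trains on a different part of MNIST.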