Using DistributedDataParallel to train a base model from scratch in the cloud