PyTorch
For training on multiple GPUs, use DistributedDataParallel (DDP)
Related libraries
Tutorials
Tips
- The shortest guide for PyTorch training on GPUs
- How 🤗 Accelerate runs very large models thanks to PyTorch