PyTorch
See also JAX (PyTorch is dead. Long live JAX).
For multiple GPUs, use distributed data parallel training (torch.nn.parallel.DistributedDataParallel).
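As a rough illustration of the distributed data parallel setup, here is a minimal sketch, not a definitive recipe. The function name `train_steps`, the toy `nn.Linear` model, the random data, and the localhost rendezvous address and port are all assumptions made for the example. It falls back to the `gloo` backend on CPU so it can also run on a machine without GPUs.

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def train_steps(rank: int, world_size: int, steps: int = 3) -> float:
    """Run a few DDP training steps in this process and return the last loss.

    Hypothetical helper for illustration; one process per GPU is assumed.
    """
    # Assumption: single-node rendezvous over localhost.
    # A launcher such as torchrun normally sets these variables itself.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")

    use_cuda = torch.cuda.is_available()
    backend = "nccl" if use_cuda else "gloo"  # gloo works on CPU
    dist.init_process_group(backend, rank=rank, world_size=world_size)
    try:
        if use_cuda:
            torch.cuda.set_device(rank)
            device = torch.device("cuda", rank)
        else:
            device = torch.device("cpu")

        # Toy model (assumption); replace with the real network.
        model = nn.Linear(10, 1).to(device)
        ddp_model = DDP(model, device_ids=[rank] if use_cuda else None)

        optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
        loss_fn = nn.MSELoss()

        loss = torch.tensor(0.0)
        for _ in range(steps):
            # Random batch (assumption); a real job would use a DataLoader
            # with a DistributedSampler so each rank sees a different shard.
            x = torch.randn(32, 10, device=device)
            y = torch.randn(32, 1, device=device)
            optimizer.zero_grad()
            loss = loss_fn(ddp_model(x), y)
            loss.backward()  # DDP averages gradients across ranks here
            optimizer.step()
        return loss.item()
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    # torchrun exports RANK and WORLD_SIZE; default to a single process.
    rank = int(os.environ.get("RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))
    print(f"rank {rank}: final loss {train_steps(rank, world_size):.4f}")
```

Run directly, the script trains in a single process. To use several GPUs on one machine, it would typically be started with a launcher such as `torchrun --nproc_per_node=<number of GPUs> script.py`, which starts one process per GPU and sets `RANK` and `WORLD_SIZE` for each.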
Related libraries
Tutorials
Tips
- The shortest guide for PyTorch training on GPUs
- How 🤗 Accelerate runs very large models thanks to PyTorch - How Hugging Face runs very large models.