Transformer model

The core architecture behind LLMs. It is built on the Attention mechanism.
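As a rough illustration of the attention mechanism mentioned above, here is a minimal sketch of scaled dot-product attention in plain Python. It assumes a single head, no masking, and no learned projections; the function names and the toy vectors are made up for this note and are not taken from any library.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention over lists of vectors (hypothetical helper).

    Each query is compared with every key, the similarities are scaled by
    sqrt(d_k) and normalized with softmax, and the result is used to take
    a weighted average of the value vectors.
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        output = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(d_v)
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# Toy input: 2 queries attending over 3 key/value pairs.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

outputs, weights = scaled_dot_product_attention(queries, keys, values)
```

Each row of `weights` sums to 1, so every output vector is a weighted average of the value vectors. A full transformer layer adds learned query/key/value projections, multiple heads, residual connections and a feed-forward block on top of this.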

Google’s T5 paper provides a unified framework to understand and train transformer models.
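The unifying idea in T5 is to cast every task as text-to-text: the input is a string with a task prefix and the target is also a string. The sketch below only illustrates that data format; the helper name `to_text_to_text` and the dictionary layout are assumptions made for this note, and the summarization strings are placeholders.

```python
def to_text_to_text(task_prefix, source_text, target_text):
    """Format one example as an (input string, target string) pair.

    Hypothetical helper: the task is encoded only in the text prefix,
    so the same model and loss can be used for every task.
    """
    return {
        "input": f"{task_prefix}: {source_text}",
        "target": target_text,
    }


examples = [
    # Translation: the target is the translated sentence.
    to_text_to_text("translate English to German", "That is good.", "Das ist gut."),
    # Summarization: placeholder strings stand in for real text.
    to_text_to_text("summarize", "<long article text>", "<short summary>"),
    # Classification: the label itself is emitted as text.
    to_text_to_text("cola sentence", "The course is jumping well.", "not acceptable"),
]

for example in examples:
    print(example["input"], "->", example["target"])
```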

Tutorials and reviews

Implementations

See also Implementations

https://huggingface.co/blog/how-to-train shows how to train a transformer model from scratch. See also How to pretrain transformer models and A complete Hugging Face tutorial: how to build and train a vision transformer.

Applications

Transformers are also used outside Language models, including in Computer vision and Reinforcement learning (Decision transformer).