Sequence Models & Attention
From RNNs to the Transformer revolution: the architectures that made large language models possible.

Level: Beginner
Topics (5):
- Attention Mechanism
- Multi-Head Attention
- RNNs & LSTMs
- Self-Attention
- Transformer Architecture