Transformer Videos from StatQuest


Attention is one of the most important concepts behind Transformers and Large Language Models like ChatGPT. However, it's not that complicated. In this StatQuest, we add Attention to a basic Sequence-to-Sequence (Seq2Seq or Encoder-Decoder) model and walk through how it works and how it is calculated, one step at a time. BAM!!!

https://youtu.be/PSs6nxngL6k
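
To give a rough feel for the kind of calculation involved, here is a minimal sketch of attention in an Encoder-Decoder setting. It assumes dot-product similarity between the current decoder state and each encoder state, which is one common choice and not necessarily the exact presentation used in the video. The function names and the toy vectors are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw similarity scores into weights that sum to 1."""
    biggest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product, used here as the similarity score (an assumption)."""
    return sum(x * y for x, y in zip(a, b))


def attention(decoder_state, encoder_states):
    """Return attention weights and the weighted sum of encoder states."""
    # Step 1: score each encoder state against the decoder state.
    scores = [dot(decoder_state, h) for h in encoder_states]
    # Step 2: run the scores through softmax to get weights.
    weights = softmax(scores)
    # Step 3: scale each encoder state by its weight and add them up.
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy values, invented for this sketch: three input tokens, two values each.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]

weights, context = attention(decoder_state, encoder_states)
print("weights:", weights)
print("context:", context)
```

In this toy case the first and third encoder states are equally similar to the decoder state, so they receive equal weight, and the second receives less. In a full model the resulting context vector would typically be combined with the decoder's own output before predicting the next token.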

Transformer Neural Networks are the heart of pretty much everything exciting in AI right now. ChatGPT, Google Translate, and many other cool things are based on Transformers. This StatQuest cuts through all the hype and shows you how a Transformer works, one step at a time.

https://www.youtube.com/watch?v=zxQyTK8quyY

https://www.youtube.com/watch?v=bQ5BoolX9Ag