There are several ways to explain and diagram transformers; these links were very helpful for my understanding:
https://blog.nelhage.com/post/transformers-for-software-engineers/
https://dugas.ch/artificial_curiosity/GPT_architecture.html
https://peterbloem.nl/blog/transformers
http://nlp.seas.harvard.edu/annotated-transformer/
https://sebastianraschka.com/blog/2023/self-attention-from-scratch.html
https://github.com/markriedl/transformer-walkthrough?ref=jeremyjordan.me
https://francescopochetti.com/a-visual-deep-dive-into-the-transformers-architecture-turning-karpathys-masterclass-into-pictures/
https://jalammar.github.io/illustrated-transformer/
https://e2eml.school/transformers.html
https://jaykmody.com/blog/attention-intuition/
https://eugeneyan.com/writing/attention/
https://www.jeremyjordan.me/attention/
Many thanks!