Thanks for putting this together, Neel. I think you achieved your goal of making it fairly unintimidating.
One quick note: all of the links in this section are outdated. Perhaps you can update them.
Good (but hard) exercise: Code your own tiny GPT-2 and train it. If you can do this, I’d say that you basically fully understand the transformer architecture.