I’m way more used to thinking about weird maths, distributed algorithms, or abstract philosophical problems than about concrete machine learning architectures. But based on everything I see about GPT-3, it seems like a good idea to learn more about it, even if only to participate in the discussion without spouting nonsense.
So I’m asking: what do you think are the must-reads on GPT-3 specifically, and maybe any prerequisites for understanding them?
nostalgebraist’s blog is a must-read regarding GPT-x, including GPT-3. Perhaps start here (“the transformer… ‘explained’?”), which helps contextualize GPT-x within the history of machine learning.
(Though I should note that nostalgebraist holds a contrarian “bearish” position on GPT-3 in particular; for the “bullish” case, read Gwern instead.)
Thanks for the answer! I knew about the “transformer explained” post, but I was not aware of its author’s position on GPT-3.
Here’s a list of resources that may be of use to you. The GPT-3 paper itself isn’t very specific on implementation details, because the changes that led to it were rather incremental (especially from GPT-2, and more so the farther back we look along the Transformer lineage). So the background needed to understand GPT-3 is broader than one might expect.
https://github.com/jalammar/jalammar.github.io/blob/master/notebooks/nlp/01_Exploring_Word_Embeddings.ipynb (notebook exploring word embeddings)
http://www.peterbloem.nl/blog/transformers Transformers from scratch
http://jalammar.github.io/illustrated-transformer/ The Illustrated Transformer
https://amaarora.github.io/2020/02/18/annotatedGPT2.html The Annotated GPT-2
http://jalammar.github.io/illustrated-gpt2/ The Illustrated GPT-2
http://jalammar.github.io/how-gpt3-works-visualizations-animations/ How GPT-3 Works
https://arxiv.org/pdf/1409.0473.pdf Attention (the original Bahdanau et al. paper)
https://arxiv.org/pdf/1706.03762.pdf Attention Is All You Need
http://nlp.seas.harvard.edu/2018/04/03/attention.html The Annotated Transformer
https://www.arxiv-vanity.com/papers/1904.02679/ Visualizing Attention
https://stats.stackexchange.com/questions/421935/what-exactly-are-keys-queries-and-values-in-attention-mechanisms (keys, queries, and values explained; see the code sketch after this list)
https://arxiv.org/pdf/1807.03819.pdf Universal Transformers
https://arxiv.org/pdf/2007.14062.pdf Big Bird (see appendices)
https://www.reddit.com/r/MachineLearning/comments/hxvts0/d_breaking_the_quadratic_attention_bottleneck_in/ (discussion: breaking the quadratic attention bottleneck)
https://www.tensorflow.org/tutorials/text/transformer (TensorFlow Transformer tutorial)
https://www.tensorflow.org/tutorials/text/nmt_with_attention (TensorFlow NMT-with-attention tutorial)
https://cdn.openai.com/blocksparse/blocksparsepaper.pdf GPU Kernels for Block-Sparse Weights
https://openai.com/blog/block-sparse-gpu-kernels/ (accompanying blog post)
https://github.com/pbloem/former/blob/master/former/transformers.py (Peter Bloem’s minimal implementation)
https://github.com/openai/blocksparse/blob/master/examples/transformer/enwik8.py (block-sparse Transformer example)
https://github.com/google/trax/blob/master/trax/models/transformer.py (Trax implementation)
https://github.com/huggingface/transformers/blob/master/src/transformers/modeling_gpt2.py (Hugging Face’s GPT-2 implementation; a short usage example follows below)
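Since several of the links above (the keys/queries/values answer and “Attention Is All You Need” in particular) revolve around the same core operation, here is a minimal NumPy sketch of single-head scaled dot-product self-attention. The dimensions, weight shapes, and function name are illustrative only, not taken from any GPT config:

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    X:             (seq_len, d_model) matrix of token embeddings.
    W_q, W_k, W_v: (d_model, d_k) learned projection matrices.
    """
    Q = X @ W_q                      # queries: what each token is looking for
    K = X @ W_k                      # keys: what each token advertises
    V = X @ W_v                      # values: the content that gets mixed
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_len, seq_len) similarity matrix
    # GPT-style decoders would apply a causal mask here (position i may
    # only attend to positions <= i); omitted to keep the sketch minimal.
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V               # each output row: weighted mix of values

# Toy usage with random weights; a trained model learns W_q, W_k, W_v.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))         # 5 tokens, d_model = 16
W_q, W_k, W_v = [rng.normal(size=(16, 8)) for _ in range(3)]
print(self_attention(X, W_q, W_k, W_v).shape)  # (5, 8)
```

Everything else in the Transformer lineage (multiple heads, causal masking, feed-forward blocks, layer norm, and the sheer scale of GPT-3) is layered on top of this one operation.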
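And since the last link is Hugging Face’s GPT-2 implementation, here is roughly the shortest path to actually running it. A sketch, assuming a recent `transformers` install; the “gpt2” checkpoint name and `generate` arguments below are the library’s standard ones, but check the current docs:

```python
# Minimal text generation with the Hugging Face GPT-2 linked above.
# Requires: pip install transformers torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # smallest GPT-2 checkpoint
model = GPT2LMHeadModel.from_pretrained("gpt2")

input_ids = tokenizer.encode("The transformer architecture", return_tensors="pt")
output = model.generate(input_ids, max_length=30, do_sample=True, top_k=50)
print(tokenizer.decode(output[0]))
```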
Thanks! I’ll try to read that.