From footnote 2 to The state of AI today:

"GPT-2 cost an estimated $43,000 to train in 2019; today it is possible to train a 124M parameter GPT-2 for $20 in 90 minutes."
Isn’t $43,000 the estimate for the 1.5B replication of GPT-2 rather than for the 124M? If so, this phrasing is somewhat misleading. We only need $250 even for the 1.5B version, but still.
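To put the point in numbers, here is a quick back-of-the-envelope sketch (assuming, as suggested above, that the $43,000 estimate refers to the 1.5B replication and that the $20 and $250 figures are the current costs for 124M and 1.5B respectively):

```python
# Back-of-the-envelope cost-reduction factors, using the figures from the thread.
# Assumption: the $43,000 estimate refers to the 1.5B GPT-2 replication.

cost_2019_1p5b = 43_000  # estimated 2019 training cost, GPT-2 1.5B (USD)
cost_now_124m = 20       # current cost, GPT-2 124M (USD)
cost_now_1p5b = 250      # current cost, GPT-2 1.5B (USD)

# Mixed-size comparison (as currently phrased): overstates the drop.
print(f"1.5B (2019) vs 124M (now): {cost_2019_1p5b / cost_now_124m:,.0f}x cheaper")

# Same-size comparison: apples to apples, and still a dramatic drop.
print(f"1.5B (2019) vs 1.5B (now): {cost_2019_1p5b / cost_now_1p5b:,.0f}x cheaper")
```

The mixed-size comparison gives a roughly 2,150x reduction, while the same-size comparison gives about 172x, which is why the current phrasing overstates the drop even though the underlying trend holds either way.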
Good catch, I think we are indeed mixing the sizes here.
As you say, the point still stands, but we will change it in the next minor update to either compare the same size or make the difference in size explicit.
Now addressed in the latest patch!