Indeed, my median future involves a significant slowdown in dense-network parameter growth.
If there is a 32 trillion parameter dense model by 2023, I’ll be surprised and update towards shorter timelines, unless it turns out to be underwhelming compared to the performance predicted by the scaling trends.
What will be your new median? (If you observe a 32 trillion parameter model in 2023.)
Hard to say, it depends a lot on the rest of the details. If the performance is as good as the scaling trends would predict, it’ll be almost human-level at text prediction and multiple choice questions on diverse topics and so forth. After fine-tuning it would probably be a beast.
I suppose I’d update my 50% mark to, like, 2027 or so? IDK.
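For a rough numerical sense of what "performance as good as the scaling trends would predict" means here, below is a minimal sketch using the parameter-count scaling law from Kaplan et al. (2020), L(N) = (N_c / N)^α_N, with that paper's published fit constants. Plugging in 32 trillion parameters is purely illustrative, and assuming the fit still holds that far out of its measured range is exactly the extrapolation being debated above.

```python
# Minimal sketch: extrapolating the Kaplan et al. (2020) parameter scaling law
# L(N) = (N_c / N) ** alpha_N  -- cross-entropy loss in nats per token.
# The constants are the published fits from that paper; treating them as
# valid at 32T parameters is an illustrative assumption, not a guarantee.

N_C = 8.8e13      # fitted constant (non-embedding parameter count)
ALPHA_N = 0.076   # fitted exponent

def predicted_loss(n_params: float) -> float:
    """Predicted cross-entropy loss (nats/token) at n_params parameters."""
    return (N_C / n_params) ** ALPHA_N

# GPT-3 scale vs. the hypothetical 32T dense model discussed above.
for n in (1.75e11, 3.2e13):
    print(f"N = {n:.2e}: predicted loss ~ {predicted_loss(n):.3f} nats/token")
```

On this fit, the hypothetical 32T model's predicted loss (~1.08 nats/token) sits far below GPT-3's (~1.60), which is the quantitative version of "almost human-level at text prediction."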
I got the idea. I would also update to a very short timeline (4-5 years) in the absence of a slowdown in dense-network parameter growth, with performance following the scaling trend. And I was pretty scared when GPT-3 was released. Like many here, I expected further growth in that direction very soon, which did not happen. So I am less scared now.
This was all well before the Chinchilla scaling paper, but it has still turned out to be absolutely true by 2023. We have PaLM-E 540B just for starters.
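For reference, the Chinchilla paper mentioned above (Hoffmann et al., 2022) replaced pure parameter-count extrapolation with a joint fit in parameters N and training tokens D, which is part of why dense parameter counts stopped ballooning. Here is a minimal sketch using the paper's published fit constants; the two runs plugged in are illustrative, roughly Gopher-sized and Chinchilla-sized.

```python
# Minimal sketch of the Chinchilla loss fit (Hoffmann et al., 2022):
# L(N, D) = E + A / N**alpha + B / D**beta
# Constants are the published fit; the example runs are illustrative.

E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    """Fitted training loss for n_params parameters and n_tokens tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

# A huge under-trained model vs. a smaller model trained on far more tokens.
print(chinchilla_loss(280e9, 300e9))   # Gopher-like: 280B params, 300B tokens
print(chinchilla_loss(70e9, 1.4e12))   # Chinchilla-like: 70B params, 1.4T tokens
```

The smaller, longer-trained run comes out ahead (~1.94 vs. ~1.99 nats/token), which is the quantitative reason "just make it denser and bigger" stopped being the frontier strategy after GPT-3.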