I was chatting with a friend of mine who works in the AI space. He said that the big thing that got them to GPT-4 was the data set, which was basically the entire internet. But now that they’ve given it the entire internet, there’s no easy way for them to go further along that axis; the next big increase in capabilities would require a significantly different direction than “more text / more parameters / more compute”.
I’d have to disagree with this assessment. Ilya Sutskever recently said that they haven’t run out of data yet. They might some day, but not yet. And Epoch projects that high-quality text data will run out in 2024, with all text data running out in 2040.