Ilya Sutskever says something about this in an interview:
https://www.youtube.com/watch?v=Yf1o0TQzry8
My recall: optimizing for next-token prediction finds intelligent schemes that can be coordinated to go further than the humans who produced the tokens in the first place. Think of GPT-n being at least as smart and knowledgeable as the best human in every specialized domain, with the combination of all these abilities at once allowing it to go further than any single human or coordinated group of humans.