I don’t understand why Eliezer changed his view that the current approach, Transformer next-token prediction, is not the path to AGI. It should not be surprising that newer versions of GPT asymptotically approach a mimicry of AGI, but that shouldn’t convince anyone that they will break through that barrier without a change in paradigm. None of the intelligent organisms we know of have imitation as their primary optimization objective: their objective function is basically to survive, or to avoid pain. As a result they form sub-goals, which might include imitation, but only to the extent that imitation is instrumental to survival. Optimizing 100% for imitation does not lead to AGI, because novelty cannot emerge from nothing but imitation.
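For concreteness, here is a minimal sketch (my own illustration, assuming a standard autoregressive language model that maps token IDs to logits) of what “optimizing 100% for imitation” means in this context: the entire training signal is next-token cross-entropy on human-produced text, with no term for survival, pain avoidance, or novelty.

```python
# Illustrative sketch, not from the original discussion: the pure imitation
# objective used to train GPT-style models. `model` is assumed to be any
# autoregressive LM returning logits of shape (batch, seq_len, vocab_size).
import torch
import torch.nn.functional as F

def next_token_loss(model, tokens: torch.Tensor) -> torch.Tensor:
    """tokens: LongTensor (batch, seq_len) of human-written text."""
    inputs, targets = tokens[:, :-1], tokens[:, 1:]
    logits = model(inputs)  # (batch, seq_len - 1, vocab_size)
    # The only thing rewarded is matching the empirical distribution of
    # the training data, i.e. predicting what a human would write next.
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
    )
```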
Ilya Sutskever says something about this in an interview:
https://www.youtube.com/watch?v=Yf1o0TQzry8
My recall: optimization on next-token prediction finds intelligent schemes that can be coordinated to go further than the humans who produced the tokens in the first place. Think of GPT-n being at least as smart and knowledgeable as the best human in every specialized domain; the combination of all these abilities at once would let it go further than any single human or any coordinated group of humans.