I interpreted that Yudkowsky tweet (on GPT-3 coding a React app) differently than you, I think.
I thought it was pertaining to modularity-of-intelligence and (relatedly) singleton-vs-multipolar. Specifically, I gather that part of the AI-foom debate was that Hanson expected AGI source code to be immensely complex and come from the accumulation of lots of little projects trading modules and ideas with each other:
“The idea that you could create human-level intelligence by just feeding raw data into the right math-inspired architecture is pure fantasy. You couldn’t build an effective cell or ecosystem or developed economy or most any complex system that way either—such things require not just good structure but also lots of good content.” (Robin Hanson, in the AI-foom debate)
By contrast, I think Yudkowsky was saying that AGI source code would be relatively simple and coherent and plausibly written by one team, and then that relatively simple source code would give rise to lots of different skills via learning. And then he was (justifiably IMO) claiming support from this example where OpenAI “fed raw data into the right math-inspired architecture” and wound up with a program that could do lots of seemingly-intelligent things that were not specifically included in the source code.
(Obviously I could be wrong, this is just my impression. Also, be warned that I mostly haven’t read the AI foom debate and could be mischaracterizing it.)
I agree that that was his object-level claim about GPT-3 coding a React app: that it’s relatively simple and coherent and can acquire lots of different skills via learning, vs being a collection of highly specialised modules. And of relevance to this post, the first is a way that intelligence improvements could be easy, and the second is a way they could be hard. Our ‘interpretation’ was more about making explicit what the observation about GPT-3 was:
GPT-3 is general enough that it can write a functioning app given a short prompt, despite the fact that it is a relatively unstructured transformer model with no explicitly coded representations for app-writing. The fact that GPT-3 is this capable suggests that ML models scale in capability and generality very rapidly with increases in computing power or minor algorithm improvements...
If we’d continued that summary, it would have said something like what you suggested, i.e.
GPT-3 is general enough that it can write a functioning app given a short prompt, despite the fact that it is a relatively unstructured transformer model with no explicitly coded representations for app-writing. The fact that GPT-3 is this capable suggests that ML models scale in capability and generality very rapidly with increases in computing power or minor algorithm improvements. This fast scaling into acquiring new capabilities, if it applies to HLMI, suggests that HLMI will also look like an initially small model that scales up and acquires lots of new capabilities as it takes in data, rather than a collection of specialized modules. If HLMI does behave this way (small model that scales up as it takes in data), that means marginal intelligence improvements will be easy at the HLMI level.
That takes the argument all the way through to the conclusion. Presumably the other interpretation of the shorter thing that we wrote is that HLMI/AGI is going to be an ML model that looks a lot like GPT-3, so improvements will be easy because HLMI will be similar to GPT-3 and scale up like GPT-3. (Whether AGI/HLMI is like current ML will be covered in a subsequent post on paths to HLMI.) What’s actually being focussed on, though, is the general property of being a simple data-driven model vs a complex collection of modules.
We address the modularity question directly in the ‘upper limit to intelligence’ section that discusses modularity of mind.