I don’t see what about that 2017 Facebook comment from Yudkowsky you find particularly prophetic.
Is it the idea that deep learning models will be opaque? But that was fairly obvious back then too. I agree that Drexler likely exaggerated how transparent a system of AI services would be, so I’m willing to give Yudkowsky a point for that. But the rest of the scenario seems kind of unrealistic as of 2023.
Some specific points:
The recursive self-improvement that Yudkowsky talks about in this scenario seems too local. I think AI self-improvement will most likely take the form of AIs assisting AI researchers, with humans gradually becoming an obsolete part of the process, rather than a single neural net modifying parts of itself during training.
The whole thing about spinning off subagents during training just doesn’t seem realistic in our current paradigm. Maybe this could happen in the future, but it doesn’t look “prophetic” to me.
The idea that models will have “a little agent inside plotting” that takes over the whole system still seems totally speculative to me, and I haven’t seen any significant empirical evidence that this happens during real training runs.
I think gradient descent will generally select pretty hard for models that do impressive things, making me think it’s unlikely that AIs will naturally conceal their abilities during training. Again, this type of stuff is theoretically possible, but it seems very hard to call this story prophetic.
I said it was prophetic relative to Drexler’s Comprehensive AI Services. Elsewhere in this comment thread I describe some specific ways in which it is better, e.g. that the AI that takes over the world will be better described as one unified agent than as an ecosystem of services. I.e. exactly the opposite of what you said here, which I was reacting to: “And many readers can no doubt point out many non-trivial predictions that Drexler got right, such as the idea that we will have millions of AIs, rather than just one huge system that acts as a unified entity.”
Here are some additional ways in which it seems better, plus responses to the points you made; these are less important than the unified-agent point I’ve already discussed:
Yudkowsky’s story mostly takes place inside a single lab instead of being widely distributed across the economy. Of course, Yudkowsky’s story presumably has lots of information transfer going in both directions (e.g. in the story China and Russia steal the code), but still, enough of the action takes place at one lab that it’s coherent to tell the story as happening in one place. As timelines shorten and takeoff draws near, this seems increasingly likely. Maybe if takeoff happens in 2035 the world will look more like what Drexler predicted, but if it happens in 2026 then it’s gonna be at one of the big labs.
We all agree that AI self-improvement currently takes the form of AIs assisting AI researchers and will gradually transition to more ‘local’ self-improvement where all of the improvement comes from the AIs themselves. You should not list this as a point in favor of Drexler.
Spinning off subagents during training? How is that not realistic? When an AI is recursively self-improving and taking over the world, do you expect it to refrain from updating its weights?
Little agent inside plotting: Are you saying future AI systems will not contain subcomponents which are agentic? Why not? Big tech companies like Microsoft are explicitly trying to create agentic AGI! But since other companies are creating non-agentic services, the end result will be a combined system that includes agentic AGI plus some non-agentic services.
Concealing abilities during training: Yeah you might be right about this one, but you also might be wrong, I feel like it could go either way at this point. I don’t think it’s a major point against Yudkowsky’s story, especially as compared to Drexler’s.
Remember that CAIS came out in 2019, whereas the Yud story I linked was from 2017. That makes it even more impressive that Yudkowsky’s story was so much more predictive.
Note that Drexler has for many years now forecast singularity in 2026; I give him significant credit for that prediction. And I think CAIS is great in many ways. I just think it’s totally backwards to assign it more forecasting points than Yudkowsky.