This is well-written, but I feel like it falls into the same trap a lot of AI-risk stories do. It follows this pattern:
1. Plausible (or at least not impossible) near-future developments in AI that could happen if all our current predictions pan out.
2. ???
3. Nanotech-enabled fully-general superintelligence converts the universe into paperclips at a significant fraction of lightspeed.
And like, the Step 1 stuff is fascinating and a worthy sci-fi story on its own, but the big question everyone has about AI risk is “How does the AI get from Step 1 to Step 3?”
(This story does vaguely suggest why Step 2 is missing—there are so many companies building AIs for the stock market or big tech or cyberwarfare that eventually one of them will stumble into self-improving AGI, which in turn will figure out nanotech—but implying the existence of an answer in-story is different from actually answering the question.)
Thanks! I agree with this critique. Note that Daniel also points out something similar in point 12 of his comment — see my response.
To elaborate a bit more on the “missing step” problem though:
I suspect many of the most plausible risk models have features that make it undesirable for them to be shared too widely. Please feel free to DM me if you’d like to chat more about this.
There will always be some point between Step 1 and Step 3 at which human-legible explanations fail. That is, it would be extremely surprising if we could tell a coherent story about the whole process — the best we can do is assume the AI gets to the end state because it’s highly competent, but we should expect it to do things we can’t understand. (To be clear, I don’t think this is quite what your comment was about. But it is a fundamental reason why we can’t ever expect a complete explanation.)
Is it something like the AI-box argument? “If I share my AI breakout strategy, people will think ‘I just won’t fall for that strategy’ instead of noticing the general problem that there are strategies they didn’t think of”? I’m not a huge fan of that idea, but I won’t argue it further.
I’m not expecting a complete explanation, but I’d like to see a story that doesn’t skip directly to “AI can reformat reality at will” without at least one intermediate step. Like, this is the third time I’ve seen an author pull this trick and I’m starting to wonder if the real AI-safety strategy is “make sure nobody invents grey-goo nanotech.”
If you have a ball of nanomachines that can take over the world faster than anyone can react to it, it doesn’t really matter whether it’s an AI or a human at the controls; as soon as it’s invented, everyone dies. It’s not so much an AI-risk problem as a problem with technological progress in general. (Fortunately, I think it’s still up for debate whether it’s even possible to create grey-goo-style nanotech.)
I see — perhaps I did misinterpret your earlier comment. It sounds like the transition you are more interested in is closer to (AI has ~free rein over the internet) ⇒ (AI invents nanotech). I don’t think this is a step we should expect to be able to model especially well, but the best story/analogy I know of for it is probably the end part of That Alien Message: what sorts of approaches would we come up with if all of human civilization were bent on solving the equivalent problem from our point of view?
If instead you’re thinking more about a transition like (AI is superintelligent but in a box) ⇒ (AI has ~free rein over the internet), then I’d say that I’d expect us to skip the “in a box” step entirely.
I really don’t get how you can go from being online to having a ball of nanomachines, truly.

Imagine an AI goes rogue today. I can’t imagine one plausible scenario in which it could take out humanity without triggering any bells along the way, even if nobody were paying attention to such things.

But we should pay attention to the bells, and to do that we need to think about them. What might the signs look like?

I think it’s really, really counterproductive to ignore this entirely and assume all is lost if it fooms. It’s not lost.

It will need humans, infrastructure, and money (which is very controllable) to accomplish its goals. Governments already pay a lot of attention to adversaries who try to do similar things, and they counteract them semi-successfully. Is there any reason they couldn’t do the same to a very intelligent AI?

Mind you, if your answer is that it will simulate everything and just do whatever it takes, true-to-life simulations will take a lot of compute and time; that won’t be available from the start.

We should stop thinking of rogue AI as a god; that would only help it accomplish its goals.
I agree; it’s hard for me to imagine what Step 2 could look like. Do you (or anyone else) have any content on that? See this post; it didn’t seem to get a lot of traction or any meaningful answers, but I still think this question is worth answering.