If an AI does something with weapons that its operators don’t want it to be doing, they will attempt to stop it. If they eventually succeed, then this doesn’t literally killeveryone, and the AI probably wasn’t the kind that can pose existential threat (even if it did cause a world-shaking disaster). If they can’t stop the AI, at all, even after trying for as long as they live, then it’s the kind of AI that would pose existential threat even without initially being handed access to weapons (if it wants weapons, it would be able to acquire them on its own). So the step of giving AI access to weapons is never a deciding factor for notkilleveryoneism, it’s only a deciding factor for preventing serious harm on a scale that’s smaller than that.
We should focus more on this immediate and concrete risk before the more abstract theories of alignment.
“Focus” suggests reallocation of a limited resource that becomes more scarce elsewhere as a result. I don’t think it would be a good thing to focus less than we currently do on making sure that the outcome is not literally everyone dying. It’s possible to get to a point where too much focus is on that, but I don’t think we are there.
Focus means spending time or energy on a task. Our time and energy are limited, and the danger of rogue AI is growing by the year. We should focus our energies by forming an achievable goal, making a reasonable plan, and acting according to that plan.
Of course, there is a spectrum to the possible outcomes caused by a hypothetical rogue AI (rAI), ranging from insignificant to catastrophic. Any access the rAI might gain to human-made intelligent weapons would amplify the rAI’s power to cause real-world damage.
The problem is that with AI, facing existential risk eventually is a certainty: the capability of unbounded autonomous consequentialist agency is feasible to develop (humans have that level of capability, and humans are manifestly feasible, so AIs would merely need to be at least as capable). Either there is a way of mitigating that risk, or it killseveryone. At which point, no second chances. This is different from world-shaking disasters, which do allow second chances and also motivate trying to do better next time.
So this specifically is a natural threat level to consider on its own, not just as one of the points on a scale. And it’s arguably plausible in the startlingly near future. And nobody has a reliable plan (or arguably any plan), including the people building the technology right now.
Yes, in the long term we will need a complete alignment strategy, such as permanent integration with our brains. However, before that happens, it would be prudent to limit the potential for a misaligned AI to cause permanent damage.
And, yes, we need a more concrete plan and commitment from the people involved in the tech, especially with regard to lethal AI.
I’m thinking one or two years from now is a plausible lower bound on when a (technological) plan would need to be enacted to still have an effect on what eventually happens, or else in four years (from now) a killeveryone arrives (again, as an arguable lower bound, not as a median forecast).
Unless it’s fine by default, on its own, for reasons nobody reliably understands in advance, not because anyone had a plan. I think there is a good chance this is true, but betting the future of humanity on that is insane. Also, even if the first AGIs don’t killeveryone, they might fail to establish strong coordination that prevents other misaligned AGIs from getting built, which do killeveryone, including the first AGIs.
I think probably it’s more like 6 and 8 years, respectively, but that’s also not a lot of time to come up with a plan that depends on having fundamental science that’s not yet developed.
Best to slow down the development of AI in sensitive fields until we have a clearer understanding of its capabilities.