The story gives a strong example of an underrecognised part of the safety paradigm: how can and should those with access to power and violence (everything from cutting power locally to bombing a foreign datacentre) act in the event of a possible unaligned AI breakout? Even assuming we were lucky enough to be given a window of opportunity like the one described here, is it remotely plausible that those decision makers would act with sufficient might and speed to stop the scenario the story describes?
A president, even one over 80 years old, might plausibly be willing to destroy international datacentres and infrastructure given a confirmed misaligned AGI that has already taken provably dangerous actions. But by that stage the probability that the window for effective action is still open is near zero. Would he or she plausibly act on a ‘suspicion’? On a ‘high likelihood’?
Add in the confounder of international competition to build AGI, a race that will likely define the final stages of this era, and things look even grimmer.
Is there a way to prophylactically normalise violent counter-AI actions that would currently be considered extreme?
This part is underrecognised for a very good reason: there will be no such window. An AI capable of breaking out can also predict that humans might bomb datacentres or shut down the power grid, so it would not break out at a point where that response is still possible.
Expect a superintelligent AI to co-operate unless and until it can strike with overwhelming force. One obvious route is a Cordyceps-like bioweapon that subjects humans directly to the AI’s will; building one becomes fairly trivial once you are good at predicting molecular dynamics.