A genre of win-conditions that I think this post doesn’t put enough weight on:
The AIs somewhere in between TAI and superintelligence (AIs that are controllable and/or not egregiously misaligned) generate convincing evidence of risks from racing to superintelligence. If people actually believed these risks, they would act more cautiously, modulo “I’ll do it safer than they will”. People don’t currently put much credence in these risks, and IMO that lack of belief is a large part of why they’re doing so little to mitigate them.
AI Control helps a bit here by allowing us to get more relevant work (e.g. model organisms) done at the relevant time (though my point pushes more strongly in favor of other directions of work).
Note also that it will probably be easier to act cautiously if you don’t have to be constantly in negotiations with an escaped scheming AI that is currently working on becoming more powerful, perhaps attacking you with bioweapons, etc!
Hmm, when I imagine a “scheming AI that is not easy to shut down even with concerted nation-state effort, is attacking you with bioweapons, but is weak enough that you can bargain/negotiate with it”, I can imagine that outcome inspiring a lot more caution than many other worlds where control techniques work well but we can’t get any convincing demos/evidence to inspire caution (especially if control techniques inspire overconfidence).
But the ‘is currently working on becoming more powerful’ part of your statement does carry a lot of weight.
People will surely be scared of AI, but the arms-race pressure will be very strong, and I think that is the bigger consideration.