Failsafe measures are a great idea. They just don’t do anything to privilege CEV + failsafe over anything_else + failsafe.
Yes. They make sure that [CEV + failsafe] is not worse than not running any AIs. Uncertainty about whether CEV works makes expected [CEV + failsafe] significantly better than doing nothing. Presence of potential controlled shutdown scenarios doesn’t argue for worthlessness of the attempt, even where detailed awareness of these scenarios could be used to improve the plan.
“Not running it” does make [CEV + failsafe] desirable, as compared to doing nothing, even in the face of problems with [CEV], and nobody is going to run just [CEV]. So most arguments for presence of problems in CEV, if they are met with adequate failsafe specifications (which is far from a template to reply to anything, failsafes are not easy), do indeed lose a lot of traction. Besides, what are they arguments for? One needs a suggestion for improvement, and failsafes are intended to make it so that doing nothing is not an improvement, even though improvements over any given state of the plan would be dandy.
Yes. They make sure that [CEV + failsafe] is not worse than not running any AIs. Uncertainty about whether CEV works makes expected [CEV + failsafe] significantly better than doing nothing. Presence of potential controlled shutdown scenarios doesn’t argue for worthlessness of the attempt, even where detailed awareness of these scenarios could be used to improve the plan.
I’m actually not even sure whether you are trying to disagree with me or not but once again, in case you are, nothing here weakens my position.
“Not running it” does make [CEV + failsafe] desirable, as compared to doing nothing, even in the face of problems with [CEV], and nobody is going to run just [CEV]. So most arguments for presence of problems in CEV, if they are met with adequate failsafe specifications (which is far from a template to reply to anything, failsafes are not easy), do indeed lose a lot of traction. Besides, what are they arguments for? One needs a suggestion for improvement, and failsafes are intended to make it so that doing nothing is not an improvement, even though improvements over any given state of the plan would be dandy.
Yes, this is trivially true and not currently disputed by anyone here. Nobody is suggesting doing nothing. Doing nothing is crazy.