Any given FAI design can turn out to be unable to do the right thing, which corresponds to tripping failsafes, but to be an FAI it must also be potentially capable (for all we know) of doing the right thing. An adequate failsafe should just turn off an ordinary AGI immediately, so failsafes won't work as an AI-in-chains FAI solution. You can't make an AI do the right thing just by adding failsafes; you also need to have a chance of winning.
Affirmed.