It seems to me that an anti-UFAI that does not also prevent the creation of FAIs would, necessarily, be just as hard to make as an FAI. Being able to identify an FAI without a model of what one is good enough that you could build one yourself seems implausible.
Am I wrong?
You’re at least plausible.
An anti-UFAI could have terms like ‘minimal collateral damage’ in its motivation that would cause it to prioritize stopping faster or more destructive AIs over slower or friendlier ones, to voluntarily limit its own growth, to accept ongoing human supervision, and to cleanly self-destruct under appropriate circumstances.
An FAI, by contrast, is expected to make the world better, not just keep it from getting worse, and as such would need to be trusted with far more autonomy and far greater long-term stability.