For instance, would they be in favour of programming an AI's friendliness using random quantum bits, if they could be reassured that, should friendliness fail, the AI would kill everyone immediately?
If you already have an is_friendly() predicate that will kill everyone if the AI isn’t friendly, why not make it just shut down the AI and try again? (If you don’t have such a predicate, I don’t know how you can guarantee the behavior of a random non-friendly AI)
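Roughly what I mean, as a toy sketch (every function here is a hypothetical stand-in — nobody actually has an is_friendly() predicate):

```python
import random

def build_ai(seed):
    """Stand-in for whatever process constructs a candidate AI (hypothetical)."""
    random.seed(seed)
    return {"seed": seed, "friendly": random.random() < 0.1}

def is_friendly(ai):
    """Stand-in for the predicate the argument assumes we already have (hypothetical)."""
    return ai["friendly"]

def shut_down(ai):
    """Discard the failed candidate instead of having it kill everyone."""
    pass

def build_and_verify_ai(max_attempts=1000):
    """Shut-down-and-retry loop: a failed friendliness check costs one attempt, not the world."""
    for attempt in range(max_attempts):
        candidate = build_ai(seed=attempt)
        if is_friendly(candidate):
            return candidate
        shut_down(candidate)
    raise RuntimeError("no friendly candidate found within the attempt budget")
```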
I’ll pick door #2, I think...