As an alternative, how about telling the AI that it won’t be destroyed? This has its own problems. Since a meteor strike would destroy the AI, and the AI now believes it cannot be destroyed, it might conclude that no meteor will ever hit the Earth. You could narrow the claim and tell it only that it won’t be destroyed by people. But then another problem appears: it might hand its off switch to everyone, figuring that knowing they’re in control will make them slightly happier, and, because it believes people won’t destroy it, it doesn’t anticipate that someone will press theirs almost immediately.
This is equivalent to setting the utility of being destroyed equal to the expected utility the AI would otherwise have had, so that, in expectation, destruction neither helps nor hurts it.
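A toy calculation makes the equivalence concrete. The sketch below is purely illustrative (the agent, the utility numbers, and the `expected_utility` helper are all hypothetical): it shows that once the utility of shutdown is set equal to the expected utility of surviving, the shutdown probability drops out of the agent’s calculation entirely, so it has no incentive to either guard or give away its off switch.

```python
# A minimal sketch of the indifference idea above, assuming a toy agent
# that ranks actions by expected utility. Nothing here comes from a real
# system; all names and numbers are made up for illustration.

def expected_utility(p_shutdown: float, u_running: float, u_shutdown: float) -> float:
    """Expected utility over the two outcomes: shut down, or keep running."""
    return p_shutdown * u_shutdown + (1 - p_shutdown) * u_running

# Utility the agent expects from continuing to pursue its plan.
U_RUNNING = 10.0

# Naive case: shutdown is worth 0, so any chance of shutdown is a pure
# loss, and the agent prefers actions that minimize p_shutdown (e.g.,
# hiding its off switch rather than handing copies to everyone).
for p in (0.0, 0.5, 1.0):
    print(p, expected_utility(p, U_RUNNING, u_shutdown=0.0))
# 0.0 -> 10.0,  0.5 -> 5.0,  1.0 -> 0.0

# Indifference case: the utility of being shut down is set equal to the
# expected utility of surviving. Now p_shutdown cancels out, so the agent
# neither resists nor seeks shutdown.
for p in (0.0, 0.5, 1.0):
    print(p, expected_utility(p, U_RUNNING, u_shutdown=U_RUNNING))
# every p -> 10.0
```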