That’s my understanding of why it’s bad, yes. The point of the button is that we want to be able to choose whether it gets pressed or not. If the AI presses it in a bunch of worlds where we don’t want it pressed and stops it from being pressed in a bunch of worlds where we do want it pressed, those are both bad. The fact that the AI is trading an equal probability mass in both directions doesn’t make it any less bad from our perspective.
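To make the "equal mass in both directions" point concrete, here is a toy sketch. The world list, the probabilities, and the helper names `p_pressed` and `p_matches_wishes` are all made up for illustration; nothing here comes from an actual proposal. It compares a baseline where the button is pressed exactly when we want it pressed against a case where the AI shifts the same amount of probability mass each way.

```python
from fractions import Fraction

# Each entry: (probability mass, we_want_press, button_pressed).
# All numbers are illustrative assumptions.
baseline = [
    (Fraction(1, 2), True, True),    # we want it pressed, and it is
    (Fraction(1, 2), False, False),  # we don't want it pressed, and it isn't
]

# Hypothetical manipulation: the AI moves 1/5 of the total mass each way.
manipulated = [
    (Fraction(3, 10), True, True),
    (Fraction(1, 5), True, False),   # AI blocks a press we wanted
    (Fraction(1, 5), False, True),   # AI causes a press we didn't want
    (Fraction(3, 10), False, False),
]

def p_pressed(worlds):
    """Total probability that the button ends up pressed."""
    return sum(p for p, _, pressed in worlds if pressed)

def p_matches_wishes(worlds):
    """Total probability that the outcome agrees with what we wanted."""
    return sum(p for p, want, pressed in worlds if want == pressed)

print(p_pressed(baseline), p_pressed(manipulated))
print(p_matches_wishes(baseline), p_matches_wishes(manipulated))
```

In this toy setup the overall probability of a press is the same in both cases, but the probability that the outcome matches our wishes drops, which is the sense in which the trade is still bad for us.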