Imagine an agent with an (incorrect) belief that only by killing everyone would the world become the best place possible, and a prior against anything realistically causing it to update away from that belief. Such an agent would have to be stopped somehow, because of what it thinks (and what that causes it to do).
That doesn’t quite follow.
Thinking something does not make it so, and there are a vanishingly small number of people who could realistically act on a desire to kill everyone. The only time you have to be deeply concerned about someone with those beliefs is if they manage to end up in a position of power, and even then that just means "stricter controls on who gets access to world-ending power" rather than searching for thoughtcriminals specifically.