Minor correction
But then, in the small fraction of worlds where we survive, we simulate lots and lots of copies of that AI where it instead gets reward 0 when it attempts to betray us!
The reward should be negative rather than 0.
Minor correction
The reward should be negative rather than 0.