Well, from the perspective of the AI, it’s behaving perfectly rationally. Among the hypotheses that could give it infinite reward, it finds the most probable one and prepares for that, no matter how small that probability is in absolute terms. It only seems strange to us humans because (1) we’re Allais-ey, with intuitions that violate expected-utility reasoning the way the Allais paradox does, and (2) it’s a clear case of logical, one-shot probability, which is less intuitive than ordinary frequency-style probability.
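To make that concrete, here’s a minimal sketch of the decision rule. The hypothesis names, probabilities, and payoffs are all invented for illustration; the only real point is that any nonzero probability times an infinite reward gives infinite expected utility, so such hypotheses dominate every finite option, and the agent tie-breaks among them by probability:

```python
import math

# All hypothesis names and numbers below are invented for illustration.
# Each entry: (name, probability, reward if true and we prepare for it)
hypotheses = [
    ("mundane physics holds",             0.9990, 100.0),
    ("loophole A gives unbounded reward", 0.0007, math.inf),
    ("loophole B gives unbounded reward", 0.0003, math.inf),
]

def expected_utility(prob, reward):
    return prob * reward  # any nonzero prob * inf == inf

# Infinite-reward hypotheses dominate every finite one; ties are broken
# by probability, so the agent prepares for the most probable route to
# infinite reward, however unlikely it is in absolute terms.
best = max(hypotheses, key=lambda h: (expected_utility(h[1], h[2]), h[1]))
print(best[0])  # -> "loophole A gives unbounded reward"
```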
If our AI models the world with one set of laws at a time, rather than keeping a probability distribution over laws, then this behavior could pop up as a surprise: the moment its single working model changes, its whole policy can swing with it.
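A tiny sketch of why, again with invented law names and probabilities: under hard model selection the agent’s policy is a step function of its beliefs, so a small belief shift across a threshold flips its behavior all at once.

```python
# Sketch: an agent that commits to whichever law is currently most
# probable, instead of averaging over laws. Names are hypothetical.
ACTIONS = {"ordinary physics": "behave normally",
           "exploit loophole": "prepare for infinite reward"}

def policy_single_law(p_loophole):
    # Hard model selection: act on the single most probable law.
    law = "exploit loophole" if p_loophole > 0.5 else "ordinary physics"
    return ACTIONS[law]

for p in (0.4999, 0.5001):
    print(f"P(loophole)={p}: {policy_single_law(p)}")
# A tiny belief shift across the threshold flips the whole policy,
# which from the outside looks like the behavior appearing suddenly.
```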
Precisely. That’s all I was saying.