I think it largely comes down to how you handle divergent resources. For divergent resources, let’s use the example of… the ultraviolet catastrophe.
Let’s suppose that the AI had a use for materials that emitted infinite power in thermal radiation. In fact, as the power emitted went up, the usefulness went up without bound. Photonic rocket engines for exploring the stars, perhaps, or how fast it could run a computational equivalent of paper-clip production.
Now, the AI knows, with very good certainty, that the ultraviolet catastrophe doesn’t actually occur. But it could get Pascal’s-wagered here: it takes actions weighted both by their probability and by the impact they could have. So it assigns a divergent weight to actions that benefit divergently from the ultraviolet catastrophe, and builds an infinite-power computer that it knows won’t work.
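The failure mode can be sketched numerically: if utility grows without bound in emitted power, then any nonzero credence in the classical (Rayleigh–Jeans) laws makes the expected utility of the doomed project diverge past any sane alternative. A minimal sketch, with a toy utility linear in power and entirely hypothetical numbers:

```python
# Toy expected-utility comparison: a "sane" project vs. a project that
# only pays off if the ultraviolet catastrophe were real.
# All probabilities and payoffs are illustrative, not physics.

def expected_utility(outcomes):
    """outcomes: iterable of (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)

# Sane project: works under the true (quantum) laws.
sane = expected_utility([(0.999, 1e6), (0.001, 0.0)])

# Catastrophe project: payoff scales with emitted power, which is
# unbounded under classical thermodynamics. A tiny credence in the
# classical laws, times a large enough payoff, eventually dominates.
credence_in_classical = 1e-12
for power in [1e9, 1e18, 1e27]:  # candidate design targets
    doomed = expected_utility([(credence_in_classical, power)])
    if doomed > sane:
        print(f"at target power {power:.0e}, the doomed project wins")
```

Because the payoff is unbounded, there is always some design target past which the doomed project’s expected utility overtakes any fixed sane alternative, no matter how small the credence.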
How is this different from accepting a bet it “knows” it will lose? We may know with certainty that it doesn’t live in a classical universe, because we specified the problem, but the AI doesn’t.
Well, from the perspective of the AI, it’s behaving perfectly rationally. It finds the highest-probability route to infinite reward and prepares for it, no matter how small that probability is. It only seems strange to us humans because (1) we’re Allais-ey, and (2) it’s a clear case of logical, one-shot probability, which is less intuitive.
If our AI models the world with one set of laws at a time, rather than having a probability distribution over laws, then this behavior could pop up as a surprise.
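The contrast between the two modeling styles can be made concrete. Below is a hedged sketch with hypothetical theories and numbers: a MAP-style agent commits to the single most probable set of laws and plans sanely, while a mixture agent weights each theory’s payoff by its posterior, and the divergent tail theory dominates.

```python
# Contrast: committing to one most-probable theory vs. keeping a
# posterior over theories. Theory names and numbers are hypothetical.

theories = {
    "quantum":   {"posterior": 1.0 - 1e-12, "max_power": 1e3},
    "classical": {"posterior": 1e-12,       "max_power": float("inf")},
}

def utility(power):
    return power  # toy utility, unbounded in achievable power

# MAP agent: picks one set of laws and plans under it alone.
map_theory = max(theories, key=lambda t: theories[t]["posterior"])
map_value = utility(theories[map_theory]["max_power"])

# Mixture agent: weights each theory's payoff by its posterior credence.
mixture_value = sum(t["posterior"] * utility(t["max_power"])
                    for t in theories.values())

print(map_theory, map_value)   # plans sanely under the quantum laws
print(mixture_value)           # inf: the divergent tail theory dominates
```

A designer who only ever inspects the MAP agent’s plans would never see the divergence coming, which is exactly the surprise described above.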
Hm.
Precisely. That’s all I was saying.