The problem with this argument is that the oracle sucks.
The humans believe they have access to an oracle that correctly predicts what happens in the real world. What they actually have is a defective oracle: one that performs well in simulated worlds but terribly in the “real” universe (more generally, in universes where the humans are real). This is a pretty big problem with the oracle!
Yes, I agree that an oracle which is incentivized to make correct predictions from its own vantage point (including possible simulated worlds, not restricted to the real world) is malign. I don’t really agree that the Solomonoff prior has this incentive. I also don’t think this is very relevant to any superintelligence we might encounter in the real world, since it is unlikely to have this specific incentive (for a variety of reasons, including “the oracle will probably care about the real world more” and, more importantly, “the oracle has no incentive to report its true beliefs anyway”).