If it’s true that simulating that universe is the simplest way to predict our human, then some non-trivial fraction of our prediction might be controlled by a simulation in another universe. If these beings want us to act in certain ways, they have an incentive to alter their simulation to change our predictions.
I find this confusing. I’m not saying it’s wrong, necessarily, but it at least feels to me like there’s a step of the argument that’s being skipped.
To me, it seems like there’s a basic dichotomy between predicting and controlling. And this is claiming that somehow an agent somewhere is doing both. (Or actually, controlling by predicting!) But how, exactly?
Is it that:
1. these other agents are predicting us, by simulating us, and so we should think of ourselves as partially existing in their universe? (with them as our godlike overlords who can continue the simulation from the current point as they wish)
2. the Consequentialists will predict accurately for a while, and then make a classic “treacherous turn” where they start slipping in wrong predictions designed to influence us rather than be accurate, after having gained our trust?
3. something else?
My guess is that it’s the second thing (in part from having read, and very partially understood, Paul’s posts on this a while ago). But then I would expect some discussion of the “treacherous turn” aspect of it—of the fact that they have to predict accurately for a while (so that we rate them highly in our ensemble of programs), and only then can they start outputting predictions that manipulate us.
Is that not the case? Have I misunderstood something?
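To make the second reading concrete, here is a toy sketch of the picture I have in my head: a finite Bayesian mixture over programs weighted by 2^(-description length), standing in for the Solomonoff prior. This is my own illustration, not anything from the post, and the program lengths and probabilities are invented. The point is that a program which matches the true sequence for a long prefix keeps essentially all of its prior weight, so when it finally “turns” its output noticeably drags the mixture’s prediction.

```python
# Toy illustration (not Solomonoff induction proper): a finite Bayesian
# mixture over "programs", each weighted by 2^(-description_length) and
# updated on a stream of observed bits. All numbers are made up.

def posterior_weights(programs, observed):
    """Renormalized weights after conditioning on the observed prefix.

    `programs` is a list of (description_length, predict_fn) pairs, where
    predict_fn(prefix) returns that program's probability that the next
    bit is 1 given the prefix seen so far.
    """
    weights = []
    for length, predict in programs:
        w = 2.0 ** (-length)  # prior: shorter programs get more weight
        for i, bit in enumerate(observed):
            p1 = predict(observed[:i])
            w *= p1 if bit == 1 else (1.0 - p1)
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

# "Honest" program: always predicts the true pattern (here, all ones).
honest = (10, lambda prefix: 0.999)

# "Consequentialist" program: mimics the honest one for the first 50 bits,
# then swerves toward predicting 0 -- the treacherous turn.
treacherous = (12, lambda prefix: 0.999 if len(prefix) < 50 else 0.001)

observed = [1] * 50  # both programs fit this prefix equally well
w_honest, w_treach = posterior_weights([honest, treacherous], observed)

# Mixture prediction for bit 51: the treacherous program has kept all of
# its prior weight relative to the honest one (it only paid a 4x
# complexity penalty), so its post-turn output pulls the mixture's
# prediction from ~0.999 down to ~0.8.
p_next = w_honest * honest[1](observed) + w_treach * treacherous[1](observed)
print(w_honest, w_treach, p_next)
```

That, at least, is the mechanism I would expect the post to spell out: the manipulator has to buy influence by predicting accurately first.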
(Btw, I found the stuff about python^10 and exec() pretty clear. I liked those examples. Thank you! It was just from this point on in the post that I wasn’t quite sure what to make of it.)
My understanding is that the first thing is what you get with UDASSA, and the second thing is what you get if you think the Solomonoff prior is useful for predicting your universe for some other reason (i.e. not because you think the likelihood of finding yourself in some situation covaries with the Solomonoff prior’s weight on that situation).