Weak arguments against the universal prior being malign
Paul Christiano makes the case that if we use the universal prior to make important predictions, then we will end up assigning a large amount of probability mass to hypotheses involving intelligent agents living in alternate universes who have thus far deliberately made correct predictions so that they might eventually manipulate us into doing what they want us to do. Paul calls these intelligent agents ‘consequentialists’.
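For concreteness, by ‘the universal prior’ I mean the standard Solomonoff mixture (nothing here is specific to Paul’s formulation): the probability of an observation sequence $x$ is the total weight of the programs $p$ that make a fixed universal Turing machine $U$ print something beginning with $x$, each program weighted by its length,

$$M(x) = \sum_{p \,:\, U(p) = x*} 2^{-|p|}.$$

A hypothesis like Paul’s consequentialists gets whatever mass is carried by the short programs that simulate their universe and read out some output channel from it.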
I find ideas like this very difficult to think about clearly, but I have a strong gut feeling that the argument is not correct. I’ve been unable to form a crisp formal argument against Paul’s proposal, but below I list a few weak reasons why the consequentialists’ probability mass in the universal prior might not be as high as Paul suggests.
Unnatural output channel: It is probably the case that in the vast majority of simple universes which ultimately spawn intelligent life, the most natural output channel is not accessible to its inhabitants. Paul gives an example of such an output channel in his post: in a cellular automaton we could read data by sampling the state of the first non-zero cell. The most natural thing here would probably be to start sampling immediately from t=0. However, if the automaton has simple rules and a simple starting state, then it will take a very large number of time-steps before consequentialist life has had time to evolve to the point at which it can start to intentionally manipulate the output cell. As another example, take our own universe: if the ‘most natural’ output channel in our universe corresponds to a particular location, then that location probably isn’t inside our light-cone right now.
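As a toy sketch of the timing problem (my own construction, not anything from Paul’s post): run a one-dimensional cellular automaton from a simple starting state and define the output channel as the value of cell 0 sampled once per tick. The early output bits are fixed by the bare dynamics long before anything that evolved inside the automaton could possibly reach the channel.

```python
# Toy illustration: rule 110 from a simple initial state, with the "output
# channel" defined as the value of cell 0 sampled once per tick. The first
# ~WIDTH bits of the channel are determined before any activity (let alone
# evolved life) has had time to propagate across to cell 0.

WIDTH = 64

def step(cells):
    """One tick of rule 110 on a fixed-width row with zero boundaries."""
    nxt = []
    for i in range(len(cells)):
        left = cells[i - 1] if i > 0 else 0
        centre = cells[i]
        right = cells[i + 1] if i < len(cells) - 1 else 0
        pattern = (left << 2) | (centre << 1) | right
        nxt.append((110 >> pattern) & 1)  # look up rule 110's truth table
    return nxt

# Simple starting state: a single live cell at the far end of the row.
cells = [0] * WIDTH
cells[-1] = 1

output_bits = []
for t in range(WIDTH):
    output_bits.append(cells[0])  # sample the output channel at time t
    cells = step(cells)

print("".join(map(str, output_bits)))  # 63 zeros before anything arrives
```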
Unnatural input channel: Similar to natural output channels not necessarily being accessible, it will often be impossible for a consequentialist to discern exactly what was fed into her universe’s input channel. In the example of a cellular automaton, the most natural input channel is probably the initial state. This is a problem for the automaton’s inhabitants because, while knowing the state of the universe at a particular time lets you predict the next state, in general it won’t let you deduce exactly how old the universe is or what its initial conditions were. A further difficulty is that if your universe implements something analogous to distance and velocity, then in many cases some of the information needed to reconstruct the input might be moving away from you too fast for you to ever recover it (e.g. a spaceship flying away from you at maximum speed in Conway’s game of life).
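A minimal sketch of the irreversibility half of this point (again my own construction): Conway’s game of life maps distinct states to the same successor, so even an inhabitant with a perfect snapshot of the present cannot, in general, recover the initial conditions.

```python
from collections import Counter

def life_step(live):
    """One tick of Conway's game of life; `live` is a set of (x, y) cells."""
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is live next tick iff it has exactly 3 live neighbours, or has
    # exactly 2 and is currently live.
    return {c for c, n in neighbour_counts.items()
            if n == 3 or (n == 2 and c in live)}

empty_board = set()    # initial state A: nothing at all
lone_cell = {(0, 0)}   # initial state B: one cell, which dies of isolation

# Both histories are identical from t=1 onwards, so no observation made at
# t >= 1 can tell the inhabitants which input their universe received.
assert life_step(empty_board) == life_step(lone_cell) == set()
```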
Implicit computational constraints: A complaint many people have about the universal prior is that it places no constraints on the amount of compute associated with a particular hypothesis (which is what lets in absurd hypotheses like daemons in alternate universes). It is worth noticing that while there is no explicit computational penalty, daemons inside the prior are subject to implicit computational constraints. If the process which the alternate-universe consequentialists must use to predict the next observation we’re about to see requires a lot of compute, then from the consequentialists’ perspective this is not irrelevant: assuming they care about lots of things, not just controlling the universal prior, they will perceive the cost of the computation as a real expense to be traded off against their other preferences, even though we don’t personally care how much compute they use. These implicit computational costs can also further compromise the consequentialists’ access to their universe’s output channel. For example, consider again a simple cellular automaton such as Conway’s game of life. Conway’s game of life is Turing complete: it’s possible to compute an arbitrary sequence of bits (or simulate any computable universe) from within the game of life. However, I suspect it isn’t possible to compute an arbitrary sequence of bits such that the string can be read off by sampling a particular cell once every time-tick. In a similar vein, while you can indeed build Minecraft inside Minecraft, you can’t do it in such a way that the ‘real’ Minecraft world and the ‘simulated’ Minecraft world run at the same speed. So constraints relating to speed of computation further restrict the kinds of output channels the consequentialists are able to manipulate (and if targeting a particular output channel is very costly, they will have to trade off the simplicity of the output channel against the expense of reaching it).
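A toy model of the speed constraint (my own construction, with a hypothetical slowdown factor): suppose the embedded computation needs SLOWDOWN host ticks to produce each bit it wants to emit. An observer sampling the output cell once per host tick then sees each intended bit smeared across SLOWDOWN samples, rather than the intended sequence at the channel’s native rate.

```python
SLOWDOWN = 5  # hypothetical cost, in host ticks, of one simulated step

def desired_bits():
    """The bit stream the embedded consequentialists would like to emit."""
    t = 0
    while True:
        yield t % 2  # stand-in for an arbitrary, expensive-to-compute sequence
        t += 1

def host_output_channel(n_ticks):
    """What an observer sampling the output cell once per host tick sees."""
    bits = desired_bits()
    current = next(bits)
    samples = []
    for tick in range(n_ticks):
        if tick > 0 and tick % SLOWDOWN == 0:
            current = next(bits)  # a fresh simulated bit is only ready now
        samples.append(current)
    return samples

print(host_output_channel(20))
# [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, ...] -- the intended alternating
# sequence 0, 1, 0, 1, ... is unattainable at one bit per host tick.
```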
I’m tempted to make further arguments about how unlikely it is that any particular consequentialist would especially care about manipulating our Solomonoff inductor rather than any of the other Solomonoff inductors in the Tegmark IV multiverse (even after conditioning on the importance of our decision and the language we use to measure complexity), but I don’t think I fully understand Paul’s idea of an anthropic update, so there’s a good chance this objection has already been addressed.
None of these considerations completely eliminates daemons from the universal prior, but I think they might reduce their probability mass to epistemically appropriate levels. I’ve relied extensively on the cellular automaton case for examples and for driving my own intuitions, which might have led me to overestimate the severity of some of the complications listed above. These ideas are super weird and I find it very hard to think clearly and precisely about them, so I could easily be mistaken; please point out any errors I’m making.