Great explanation, you have found the crux. I didn’t know such problems were called singular perturbation problems.
If I thought that reasoning about the UP was definitely a singular perturbation problem in the relevant sense, then I would agree with you (that the malign prior argument doesn’t really work). I think it’s probably not, but I’m not extremely confident.
Your argument that it is a singular perturbation problem is that it involves self-reference. I agree that self-reference is kinda special and can make it difficult to formally model things, but I will argue that it is often reasonable to just treat the inner approximations as exact.
The reason is: problems that involve self-reference are often easy to approximate by using more coarse-grained models as you move deeper.
One example as an intuition pump is an MCTS chess bot. In order to find a good move, it needs to think about its opponent thinking about itself, etc. We can’t compute this exactly (because it’s exponential, not because it’s non-computable), but if we approximate the deeper layers by pretending they move randomly (!), it works quite well. Having a better move distribution works even better.
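A minimal toy sketch of the “replace the deeper layers with random play” idea, using a trivial subtraction game instead of chess (the game and the function names are just made up for illustration, not an actual MCTS implementation):

```python
import random

# Toy stand-in for chess: players alternately remove 1-3 stones from a pile;
# whoever takes the last stone wins.
def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def random_rollout(stones):
    """Both sides play uniformly at random from `stones` onward.
    Returns True iff the player to move at the start of the rollout wins."""
    starter_to_move = True
    while True:
        stones -= random.choice(legal_moves(stones))
        if stones == 0:
            return starter_to_move
        starter_to_move = not starter_to_move

def estimate_move_values(stones, n_rollouts=2000):
    """One ply of real search; everything deeper (the opponent modelling us
    modelling them, ...) is replaced by random play."""
    values = {}
    for m in legal_moves(stones):
        remaining = stones - m
        if remaining == 0:
            values[m] = 1.0  # taking the last stone wins immediately
        else:
            # After our move the opponent starts the rollout, so we win
            # exactly when the rollout's starting player loses.
            wins = sum(not random_rollout(remaining) for _ in range(n_rollouts))
            values[m] = wins / n_rollouts
    return values

print(estimate_move_values(10))  # crude value estimates for moves 1, 2, 3
```

In this toy game, taking 2 (leaving a multiple of 4) happens to be optimal, and the noisy rollout estimates usually already rank it highest, even though the deeper layers are modelled as pure noise.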
Maybe you’ll object that this example isn’t precisely self-reference. But the same algorithm (usually) works for finding Nash equilibria in simultaneous-move games, which do involve infinitely deep self-reference.
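For the simultaneous-move case, here is a similarly minimal sketch (again my own toy example, not from the linked work, and not literally the same algorithm as MCTS, just the same best-respond-to-a-model-of-the-opponent spirit): fictitious play on rock-paper-scissors, where each player best-responds to an empirical model of an opponent who is doing the same back, and the empirical move frequencies converge to the uniform mixed Nash equilibrium.

```python
import numpy as np

# Row player's payoffs in rock-paper-scissors (zero-sum, so the column
# player's payoffs are the negation of these).
A = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]])

def fictitious_play(A, iters=20000):
    """Each round, each player best-responds to the empirical frequencies of
    the other's past moves. In zero-sum games the empirical mixtures approach
    a Nash equilibrium (here: uniform over the three moves)."""
    row_counts = np.ones(3)  # start with one pseudo-count per move
    col_counts = np.ones(3)
    for _ in range(iters):
        row_move = np.argmax(A @ (col_counts / col_counts.sum()))
        col_move = np.argmax(-(row_counts / row_counts.sum()) @ A)
        row_counts[row_move] += 1
        col_counts[col_move] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

print(fictitious_play(A))  # both frequency vectors approach (1/3, 1/3, 1/3)
```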
And another, more general way of doing essentially the same thing is using a reflective oracle, which I believe can also be used to describe a UP that can contain infinitely deep self-reference (see the last paragraph of the conclusion).[1] I think the fact that Paul worked on this suggests that he did see the potential issues with self-reference and wanted better ways to reason formally about such systems.
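For reference, the key property (stated roughly and from memory, so see the paper for the exact version) is that a reflective oracle O gives answers about probabilistic oracle machines that are consistent with how those machines actually behave when run with access to O itself: for every probabilistic oracle machine M and rational p in [0, 1],

```latex
\Pr\!\left[M^{O}() = 1\right] > p \;\Longrightarrow\; \Pr\!\left[O(M, p) = 1\right] = 1,
\qquad
\Pr\!\left[M^{O}() = 0\right] > 1 - p \;\Longrightarrow\; \Pr\!\left[O(M, p) = 0\right] = 1.
```

In the borderline case the oracle is allowed to randomize, which is what makes such fixed points exist at all.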
To be clear, I don’t think any of these examples tells us that the problem is definitely a regular perturbation problem. But I think they do suggest that assuming it is regular is a very reasonable place to start, and that doing so probably tells us a lot about similar, more realistic systems.
On the gap between the computable and uncomputable: It’s not so bad to trifle a little. Diagonalization arguments can often be avoided with small changes to the setup, and a few of Paul’s papers are about doing exactly this.
And the same argument works for a computable prior. E.g. we could make a prior over a finite set of total Turing machines, such that it still contained universes with clever agents.
> Why would we imagine that a single TM can approximate this arbitrarily well?
If I remember correctly, a single TM definitely can’t approximate it arbitrarily well. But my argument doesn’t depend on this.
Don’t trust me on this though, my understanding of reflective oracles is very limited.
Thanks for the link to reflective oracles!

> On the gap between the computable and uncomputable: It’s not so bad to trifle a little. Diagonalization arguments can often be avoided with small changes to the setup, and a few of Paul’s papers are about doing exactly this.
I strongly disagree with this: diagonalization arguments often cannot be avoided at all, no matter how you change the setup. This is what vexed logicians in the early 20th century: no matter how you change your formal system, you won’t be able to avoid Gödel’s incompleteness theorems.
There is a trick that reliably gets you out of such paradoxes, however: switch to probabilistic mixtures. This is easily seen in a game setting: in rock-paper-scissors, there is no deterministic Nash equilibrium. Switch to mixed strategies, however, and suddenly there is always a Nash equilibrium.
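Spelling out the rock-paper-scissors case: against a uniformly mixing opponent, every pure action has the same expected payoff:

```latex
\tfrac{1}{3}\cdot 0 \;+\; \tfrac{1}{3}\cdot(-1) \;+\; \tfrac{1}{3}\cdot(+1) \;=\; 0.
```

So neither player can gain by deviating: the uniform mixture is a Nash equilibrium even though no pure-strategy equilibrium exists.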
This is the trick that Paul is using: he is switching from deterministic Turing machines to randomized ones. That’s fine as far as it goes, but it has some weird side effects. One of them is that if a civilization is trying to predict the universal prior that is simulating itself, and tries to send a message, then it is likely that with “reflective oracles” in place, the only message it can send is random noise. That is, Paul shows reflective oracles exist in the same way that Nash equilibria exist; but there is no control over what the reflective oracle actually is, and in paradoxical situations (like rock-paper-scissors) the Nash equilibrium is the boring “mix everything together uniformly”.
The underlying issue is that a universe that can predict the universal prior, which in turn simulates the universe itself, can encounter a grandfather paradox. It can see its own future by looking at the simulation, and then it can do the opposite. The grandfather paradox here is that the universe decides to kill the grandfather of a child that the simulation predicts will exist.
Paul solves this by only letting it see its own future using a “reflective oracle”, which essentially finds a fixed point (which is a probability distribution). The fixed point of a grandfather paradox is something like “half the time the simulation shows the grandchild alive, causing the real universe to kill the grandfather; the other half the time, the simulation shows the grandfather dead and the grandchild not existing”. Such a fixed point exists even when the universe tries to do the opposite of the prediction.
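Spelling out that fixed point: write q for the probability that the oracle reports “grandchild alive”. If the universe kills the grandfather exactly when the report says “alive”, then the grandchild actually ends up alive with probability 1 - q, and consistency of the report with reality forces:

```latex
q \;=\; 1 - q \quad\Longrightarrow\quad q \;=\; \tfrac{1}{2}.
```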
The thing is, this fixed point is boring! Repeat this enough times, and it eventually just says “well my prediction about your future is random noise that doesn’t have to actually come true in your own future”. I suspect that if you tried to send a message through the universal prior in this setting, the message would consist of essentially uniformly random bits. This would depend on the details of the setup, I guess.
> I strongly disagree with this: diagonalization arguments often cannot be avoided at all, no matter how you change the setup. …
>
> There is a trick that reliably gets you out of such paradoxes, however: switch to probabilistic mixtures.
Fair enough, the probabilistic mixtures thing was what I was thinking of as a change of setup, but reasonable to not consider it such.
> the message would consist of essentially uniformly random bits
I don’t see how this is implied. If a fact is consistent across levels, and determined in a non-paradoxical way, can’t this become a natural fixed point that can be “transmitted” across levels? And isn’t this kind of knowledge all that is required for the malign prior argument to work?
The problem is that the act of leaving the message depends on the output of the oracle (if it didn’t, you wouldn’t need the oracle at all, but then you also wouldn’t know how to leave the message). If the behavior of the machine depends on the oracle’s actions, then we have to be careful about what the fixed point will be.
For example, if we try to fight the oracle and do the opposite, we get the “noise” situation from the grandfather paradox.
But if we try to cooperate with the oracle and do what it predicts, then there are many different fixed points and no telling which the oracle would choose (this is not specified in the setting).
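A minimal illustration of that multiplicity (my own toy framing): suppose the civilization just does whatever single bit the oracle predicts. Writing q for the probability the oracle predicts 1, the consistency condition becomes:

```latex
\Pr[\text{universe does } 1] \;=\; q \;=\; \Pr[\text{oracle predicts } 1],
```

which holds for every q in [0, 1]. Unlike the grandfather case, nothing in the setup singles out which of these fixed points the oracle actually realizes.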
It would be great to see a formal model of the situation. I think any model in which such message transmission would work is likely to require some heroic assumptions which don’t correspond much to real life.
If the only transmissible message is essentially uniformly random bits, then of what value is the oracle?
I claim the message can contain lots of information. E.g. if there are 2^100 potential actions, but only 2 fixed points, then 99 bits have been transmitted (relative to uniform).
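In bits, relative to the uniform distribution over the 2^100 actions, narrowing down to 2 fixed points conveys:

```latex
\log_2 2^{100} - \log_2 2 \;=\; 100 - 1 \;=\; 99 \text{ bits}.
```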
The rock-paper-scissors example is relatively special, in that the oracle can’t narrow down the space of actions at all.
The UP situation looks to me to be more like the first situation than the second.
It would help to have a more formal model, but as far as I can tell the oracle can only narrow down its predictions of the future to the extent that those predictions are independent of the oracle’s output. That is to say, if the people in the universe ignore what the oracle says, then the oracle can give an informative prediction.
This would seem to exactly rule out any type of signal which depends on the oracle’s output, which is precisely the types of signals that nostalgebraist was concerned about.
That can’t be right in general. Normal Nash equilibria can narrow down predictions of actions, e.g. in a competition game. This is despite each player’s decision depending on the other player’s action.
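One toy instance (my own reading of “competition game”, with made-up numbers): a Cournot duopoly, where each firm’s best-response quantity depends on the other firm’s quantity, yet iterating best responses converges to the unique Nash equilibrium, so equilibrium reasoning sharply narrows the prediction of both actions.

```python
# Cournot competition: two firms choose quantities; price = a - (q1 + q2),
# each firm's profit = (price - c) * q_i. Each firm's best response depends
# on the other's quantity, yet iterating best responses converges to the
# unique Nash equilibrium q1 = q2 = (a - c) / 3.
a, c = 10.0, 1.0  # demand intercept and marginal cost (made-up numbers)

def best_response(q_other):
    # argmax_q (a - q - q_other - c) * q  =>  q = (a - c - q_other) / 2
    return max(0.0, (a - c - q_other) / 2)

q1, q2 = 0.0, 0.0
for _ in range(50):
    q1, q2 = best_response(q2), best_response(q1)

print(q1, q2, (a - c) / 3)  # both quantities converge to 3.0
```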
That’s fair, yeah.
We need a proper mathematical model to study this further. I expect it to be difficult to set up because the situation is so unrealistic/impossible as to be hard to model. But if you do have a model in mind, I’ll take a look.