Suppose you (as a human) are playing chicken against this version of UDT, which has vastly more computing power than you and could simulate your decisions in its proofs. Would you swerve?
I wouldn’t, because I would reason that if I didn’t swerve, UDT would simulate that and conclude that not swerving leads to the highest utility. You said “By deliberately crashing into the formerly smart madman, UDT can retroactively erase the situation.” but I don’t see how this version of UDT does that.
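For concreteness, here is the payoff structure I have in mind (the exact numbers are my own toy choice; only the ordering matters: mutual crash is worst, going straight against a swerver is best, mutual swerve is in between):

```python
# Toy chicken payoffs: (my payoff, opponent's payoff),
# indexed by (my action, opponent's action).
PAYOFFS = {
    ("swerve", "swerve"): (0, 0),
    ("swerve", "straight"): (-1, 1),
    ("straight", "swerve"): (1, -1),
    ("straight", "straight"): (-10, -10),
}

def best_response(opponent_action):
    """My payoff-maximizing action, given a fixed prediction of the opponent."""
    return max(["swerve", "straight"],
               key=lambda a: PAYOFFS[(a, opponent_action)][0])

# If I can reliably predict that UDT swerves, going straight is best;
# if I expected it to go straight, I'd swerve.
print(best_response("swerve"))    # straight
print(best_response("straight"))  # swerve
```

This is just the standard chicken structure; the interesting question is whether UDT's proof search lets me lock in the "opponent swerves" prediction.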
I don’t know what logical updatelessness means, and I don’t see where the article describes it.
You’re right; the post only mentions it obliquely and assumes the reader already knows the concept, in this paragraph:
Both agents race to decide how to decide first. Each strives to understand the other agent’s behavior as a function of its own, to select the best policy for dealing with the other. Yet, such examination of the other needs to itself be done in an updateless way. It’s a race to make the most uninformed decision.
Not sure what’s a good reference for logical updatelessness. Maybe try some of these posts? The basic idea is just that even if you manage to prove that your opponent doesn’t swerve, you perhaps shouldn’t “update” on that and then make your own decision while treating it as a fixed fact that can’t be changed.
If I didn’t assume PA is consistent, I would swerve, because I wouldn’t know whether UDT might falsely prove that I swerve. Since PA is in fact consistent and I assume this, I am better at predicting UDT than UDT is at predicting itself, so it swerves while I don’t. Can you find a strategy that beats UDT, doesn’t disentangle its opponent from the environment, swerves against itself, and “doesn’t assume UDT’s proof system is consistent”?
It sounds like you mentioned logical updatelessness because my version of UDT does not trust a proof of “u = …”, it wants the whole set of proofs of “u >= …”. I’m not yet convinced that there are any other proofs it must not trust.
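To make sure we mean the same decision rule: here is a toy sketch of "trust the set of lower-bound proofs" as I understand it. The `bounds` dictionary stands in for the output of PA proof search (which values b have a proof of "this action implies u >= b"); the numbers are hypothetical, since real proof search obviously isn't implementable as written:

```python
def udt_action(bounds):
    """bounds: for each action, the set of values b for which
    'taking this action implies u >= b' is provable.
    The agent trusts provable lower bounds rather than a single
    'u = ...' proof, and picks the action whose best provable
    guarantee is highest."""
    return max(bounds, key=lambda a: max(bounds[a]))

# Hypothetical proof-search output in chicken: swerving has a provable
# guarantee of at least -1 (and also 0), going straight only -10.
bounds = {"swerve": {-1, 0}, "straight": {-10}}
print(udt_action(bounds))  # swerve
```

On this reading, the question is whether any proofs *other* than exact "u = …" proofs also need to be distrusted.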