It’s not obvious to me why the bolded assertion follows; isn’t the point of “updatelessness” precisely that you refrain from conditioning your decision on (negative-sum) actions taken by your opponent, when conditioning on those actions would, if known in advance, predictably incentivize your opponent to take them? Isn’t that the whole point of having a decision theory that doesn’t give in to blackmail?
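To make the deterrence logic concrete, here’s a minimal sketch of a toy blackmail game (the game, payoffs, and names are all hypothetical illustrations, not any canonical formulation): an agent that conditions on already having been blackmailed prefers paying to resisting, but a blackmailer who can predict policies only bothers blackmailing that kind of agent, so the ex-ante best policy is to never pay.

```python
# Toy blackmail game: a minimal sketch. All payoffs and policy names
# are hypothetical illustrations chosen for this example.

POLICIES = ["pay_if_blackmailed", "never_pay"]

def blackmailer_acts(agent_policy: str) -> bool:
    """The blackmailer predicts the agent's policy and only
    blackmails an agent known to give in."""
    return agent_policy == "pay_if_blackmailed"

def agent_utility(agent_policy: str) -> float:
    blackmailed = blackmailer_acts(agent_policy)
    if not blackmailed:
        return 0.0    # no blackmail occurs; status quo
    if agent_policy == "pay_if_blackmailed":
        return -1.0   # pays the blackmailer
    return -10.0      # refuses and suffers the executed threat

# Updateless choice: evaluate whole policies before any observation.
best = max(POLICIES, key=agent_utility)
print(best)  # -> "never_pay"

# By contrast, an agent that first updates on the observation
# "I have been blackmailed" and only then compares actions finds
# paying (-1) better than resisting (-10) -- and that disposition
# is exactly what invites the blackmail in the first place.
```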
By “has to” I didn’t mean that it’s normatively the right thing to do, but rather that it’s what UDT (as currently formulated) says to do. UDT is (currently) updateless with regard to physical observations (inputs from your sensors) but not logical observations (things that you compute in your own mind), and nobody seems to know how to formulate a decision theory that is logically updateless (and not broken in other ways). It appears to be a hard problem: progress has been stalled for more than ten years.
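Here’s a minimal sketch of what “updateless with regard to physical observations” means in practice, using a toy counterfactual-mugging-like setup (the two worlds, the payoffs, and all names are hypothetical illustrations, not part of UDT’s formal statement): instead of observing an input and then choosing the best action given that input, the agent chooses, once and for all, the best function from observations to actions under its prior.

```python
# Policy selection over physical observations, in the spirit of UDT.
# A minimal sketch with a hypothetical two-world model and payoffs.

from itertools import product

OBSERVATIONS = ["heads", "tails"]
ACTIONS = ["accept", "refuse"]
PRIOR = {"heads": 0.5, "tails": 0.5}

def utility(world: str, policy: dict) -> float:
    """Toy counterfactual-mugging-like payoff: the reward in the
    heads world depends on what the policy does in the tails world."""
    if world == "heads":
        # A predictor rewards agents whose policy accepts on tails.
        return 10.0 if policy["tails"] == "accept" else 0.0
    else:
        # Accepting on tails costs 1 in the tails world itself.
        return -1.0 if policy["tails"] == "accept" else 0.0

# Enumerate every policy, i.e. every map from observation to action.
policies = [dict(zip(OBSERVATIONS, acts))
            for acts in product(ACTIONS, repeat=len(OBSERVATIONS))]

def expected_utility(policy: dict) -> float:
    return sum(PRIOR[w] * utility(w, policy) for w in OBSERVATIONS)

best = max(policies, key=expected_utility)
print(best)  # -> a policy with best["tails"] == "accept"
             # (the heads action is a tie-break and doesn't matter)
```

The updateless agent accepts on tails (losing 1 there) because that policy is worth 4.5 in expectation versus 0 for refusing; an agent that first updates on having observed tails would refuse, since from its updated perspective accepting is a pure loss. Note this sketch only covers physical observations, which is exactly the part of UDT that is currently worked out; nobody knows how to do the analogous move for logical observations.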
Conceptual Problems with UDT and Policy Selection is probably the best article to read to get up to date on this issue, if you want a longer answer.