Clearly B has mistaken beliefs about either A or its own dispositions; otherwise B would not have dealt with A in the interaction where A ended up cheating. If B uses UDT (and hence will carry through punishments), and A uses any DT that correctly forecasts B’s response to cheating, then A should not in fact cheat. If A cheats anyway, though, B still punishes.
Actually, on further reflection, it’s possible that B would reason that it is logically impossible for A to have the specified dependency on B’s decision, and yet for A to still end up defecting, in which case even UDT might end up in trouble—it would be a transparent logical impossibility for A to defect if B’s beliefs about A are true, so it’s not clear that B would handle the event correctly. I’ll have to think about this.
If there is some probability of A cheating even if B precommits to punishment, but with odds in B’s favor, then the situation where B needs to carry out the punishment is quite possible (and expected). Likewise, if B precommitting to punish A is predicted to lead to an even worse outcome than not punishing (because of the cost of punishing), UDT B won’t punish A. Furthermore, a probability of cheating and a probability of not punishing cheating (mixed strategies, possibly based on logical uncertainty to defy the laws of the game if pure strategies are required) is a mechanism through which the players can (consensually) bargain with each other in the resulting parallel game, an issue Wei Dai mentioned in the other reply. In both cases, B doesn’t need absolute certainty at any stage.
Also, in UDT there are no logical certainties, as it doesn’t update on logical conclusions either.
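To make the expected-value comparison above concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the function names, the payoffs, and the cheating probabilities are made up, and it only compares two fixed policies rather than implementing UDT.

```python
def expected_utility(p_cheat, honest_payoff, cheated_payoff, punish_cost=0.0):
    """B's expected payoff from dealing with A under one fixed policy.

    punish_cost is what B pays to carry out the punishment if A cheats
    (0.0 for a policy that never punishes).
    """
    return (1 - p_cheat) * honest_payoff + p_cheat * (cheated_payoff - punish_cost)


def choose_policy(p_cheat_if_punish, p_cheat_if_lenient,
                  honest_payoff, cheated_payoff, punish_cost):
    """Compare 'always punish cheating' against 'never punish' ex ante."""
    eu_punish = expected_utility(p_cheat_if_punish, honest_payoff,
                                 cheated_payoff, punish_cost)
    eu_lenient = expected_utility(p_cheat_if_lenient, honest_payoff,
                                  cheated_payoff)
    best = "punish" if eu_punish > eu_lenient else "lenient"
    return best, eu_punish, eu_lenient


# Hypothetical numbers: the precommitment deters most cheating, so B commits
# even though it will sometimes actually have to pay for punishing.
deterrent_case = choose_policy(0.1, 0.8, honest_payoff=10,
                               cheated_payoff=-5, punish_cost=3)

# Hypothetical numbers: punishing is so expensive that B prefers not to commit.
costly_case = choose_policy(0.5, 0.6, honest_payoff=10,
                            cheated_payoff=-5, punish_cost=100)

print(deterrent_case[0], costly_case[0])
```

In the first case A still cheats with probability 0.1 under the commitment, so B ending up in the punishing branch is an expected outcome of a good policy rather than evidence of a mistake.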
If there is some probability of A cheating even if B precommits to punishment
Sure, but that’s the convenient setup. What if A cheating means that you are necessarily just mistaken about which algorithm A runs?
Also, in UDT there are no logical certainties, as it doesn’t update on logical conclusions either.
UDT will be logically certain about some things but not others. If UDT B “doesn’t update” on its computation about what A will do in response to B, it’s going to be in trouble.
What if A cheating means that you are necessarily just mistaken about which algorithm A runs?
A decision algorithm should never be mistaken, only uncertain.
UDT will be logically certain about some things but not others. If UDT B “doesn’t update” on its computation about what A will do in response to B, it’s going to be in trouble.
“Doesn’t update” doesn’t mean that it doesn’t use the info (but you know that, so what do you mean?). A logical conclusion can be a parameter in a strategy without making the algorithm unable to reason about what things would be like if the conclusion were different; that is basically uncertainty about the same algorithm in other states of knowledge.
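A rough sketch of what "a conclusion as a parameter in a strategy" could look like: B picks a whole mapping from what it finds out about A to its response, scoring each mapping across all branches at once, and only then looks up the branch it is actually in. The payoff table, the `P_CHEAT` model, and all names here are hypothetical, and this toy is not meant as a faithful implementation of UDT.

```python
from itertools import product

OBSERVATIONS = ("cooperate", "cheat")   # what B may find out about A
RESPONSES = ("forgive", "punish")       # what B can do in reply

# Assumed model: A's chance of cheating depends on how B's strategy
# responds to cheating (A is taken to forecast B's strategy).
P_CHEAT = {"forgive": 0.8, "punish": 0.1}

# Assumed payoffs to B for each (A's action, B's response) pair.
PAYOFF = {
    ("cooperate", "forgive"): 10,
    ("cooperate", "punish"): 2,
    ("cheat", "forgive"): -5,
    ("cheat", "punish"): -8,
}


def ex_ante_value(strategy):
    """Score a full strategy over both branches, without conditioning on either."""
    p = P_CHEAT[strategy["cheat"]]
    return ((1 - p) * PAYOFF[("cooperate", strategy["cooperate"])]
            + p * PAYOFF[("cheat", strategy["cheat"])])


def best_strategy():
    """Enumerate every mapping from observation to response and keep the best."""
    strategies = [dict(zip(OBSERVATIONS, choice))
                  for choice in product(RESPONSES, repeat=len(OBSERVATIONS))]
    return max(strategies, key=ex_ante_value)


strategy = best_strategy()

# The observation is used only as a lookup key into the already-chosen
# strategy; the strategy itself was never re-derived after observing "cheat".
response_to_cheating = strategy["cheat"]
print(strategy, response_to_cheating)
```

The point of the design is that the "cheat" branch stays part of the evaluation even when B considers it unlikely, so B still has a defined response if it ends up there.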