> I tried to understand Caspar’s EDT+SSA but was unable to figure it out. Can someone show how to apply it to an example like the AMD to help illustrate it?
Sorry about that! I’ll try to explain it some more. Let’s take the original AMD. Here, the agent only faces a single type of choice: whether to EXIT or CONTINUE. Hence, in place of a policy we can just condition on the continuation probability p when computing our SSA probabilities. Now, when using EDT+SSA, we assign probabilities to being a specific instance in a specific possible history of the world. For example, we assign probabilities of the form $P_{\text{SSA}}(X \text{ in } XYB \mid p)$, which denotes the probability that, given that I CONTINUE with probability p, history XYB (i.e., CONTINUE at the first intersection, then EXIT at the second) is actual and I am the instance at intersection X (i.e., the first intersection). Since we’re using SSA, these probabilities are computed as follows:
$$P_{\text{SSA}}(X \text{ in } XYB \mid p) = P(XYB \mid p) \cdot P_{\text{SSA}}(X \mid XYB) = P(XYB \mid p) \cdot \frac{1}{2}.$$
That is, we first compute the probability that the history itself is actual (given p). Then we multiply it by the probability that, within that history, I am the instance at X, which under SSA is simply 1 divided by the number of instances of myself in that history, i.e., 2.
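To make this concrete, here is a minimal Python sketch of these SSA probabilities for the AMD, assuming the standard payoffs (EXIT at the first intersection pays 0, EXIT at the second pays 4, CONTINUE through both pays 1); the data structure and function names below are my own, not from any of the papers:

```python
# Possible histories of the absent-minded driver, keyed by name.
# Each entry gives the history's probability as a function of
# p = Pr(CONTINUE), the driver's payoff, and the number of
# instances of the driver (intersections reached) it contains.
HISTORIES = {
    "XA":  {"prob": lambda p: 1 - p,       "utility": 0, "instances": 1},  # EXIT at X
    "XYB": {"prob": lambda p: p * (1 - p), "utility": 4, "instances": 2},  # CONTINUE, EXIT
    "XYC": {"prob": lambda p: p * p,       "utility": 1, "instances": 2},  # CONTINUE, CONTINUE
}

def p_ssa(history: str, p: float) -> float:
    """P_SSA(I am a given instance in `history` | p): the probability
    of the history times 1 over the number of instances of me in it."""
    h = HISTORIES[history]
    return h["prob"](p) / h["instances"]

# Example: P_SSA(X in XYB | p = 0.5) = 0.25 * (1/2) = 0.125
print(p_ssa("XYB", 0.5))
```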
Now, the expected value according to EDT+SSA given p can be computed by summing over all possible situations, i.e., over all combinations of a history and a position within that history, and multiplying the probability of each situation by the utility received in that situation:
$$\begin{aligned}
&P_{\text{SSA}}(X \text{ in } XYB \mid p) \cdot 4 + P_{\text{SSA}}(X \text{ in } XYC \mid p) \cdot 1 + P_{\text{SSA}}(X \text{ in } XA \mid p) \cdot 0 \\
&\qquad + P_{\text{SSA}}(Y \text{ in } XYB \mid p) \cdot 4 + P_{\text{SSA}}(Y \text{ in } XYC \mid p) \cdot 1 \\
&= \tfrac{1}{2} P(XYB \mid p) \cdot 4 + \tfrac{1}{2} P(XYC \mid p) \cdot 1 + \tfrac{1}{2} P(XYB \mid p) \cdot 4 + \tfrac{1}{2} P(XYC \mid p) \cdot 1 \\
&= P(XYB \mid p) \cdot 4 + P(XYC \mid p) \cdot 1
\end{aligned}$$
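Continuing the sketch above, we can check this equivalence numerically (`edt_ssa_value` and `ex_ante_value` are again my own labels):

```python
def edt_ssa_value(p: float) -> float:
    """EDT+SSA expected value: sum over all (history, position)
    situations of P_SSA(situation | p) times the situation's utility."""
    return sum(
        h["prob"](p) / h["instances"] * h["utility"]
        for h in HISTORIES.values()
        for _position in range(h["instances"])  # one term per instance
    )

def ex_ante_value(p: float) -> float:
    """Ex ante (UDT) expected value: sum over histories of the
    history's probability times its utility."""
    return sum(h["prob"](p) * h["utility"] for h in HISTORIES.values())

# The two values agree for every p:
for p in (0.0, 0.25, 0.5, 2 / 3, 1.0):
    assert abs(edt_ssa_value(p) - ex_ante_value(p)) < 1e-12
```

Each of a history's instances contributes prob/instances times the utility, so summing over positions just cancels the 1/instances factor, mirroring the cancellation of the halves in the derivation above.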
And that’s exactly the ex ante expected value (or UDT-expected value, I suppose) of continuing with probability p. Concretely, with $P(XYB \mid p) = p(1-p)$ and $P(XYC \mid p) = p^2$, this is $4p(1-p) + p^2$, which is maximized at $p = 2/3$. Hence, EDT+SSA’s recommendation in the AMD is the ex ante optimal policy (or UDT’s recommendation, I suppose). This realization is not original to me (though I came up with it independently, in collaboration with Johannes Treutlein); the following papers make the same point:
Rachael Briggs (2010): Putting a value on Beauty. In Tamar Szabo Gendler and John Hawthorne, editors, Oxford Studies in Epistemology: Volume 3, pages 3–34. Oxford University Press. http://joelvelasco.net/teaching/3865/briggs10-puttingavalueonbeauty.pdf
Wolfgang Schwarz (2015): Lost memories and useless coins: revisiting the absentminded driver. Synthese. https://www.umsu.de/papers/driver-2011.pdf
My comment generalizes these results a bit to include cases in which the agent faces multiple different decisions.
Thanks, I think I understand now, and made some observations about EDT+SSA at the old thread. At this point I’d say this quote from the OP is clearly wrong:

> So, we could say that CDT+SIA = EDT+SSA = UDT1.0; or, CDT=EDT=UDT for short.

In fact UDT1.0 > EDT+SSA > CDT+SIA, because CDT+SIA is not even able to coordinate agents making the same observation, while EDT+SSA can do that but cannot coordinate agents making different observations, and UDT1.0 can (probably) coordinate agents making different observations (though it seems at least some of those cases require UDT1.1).
Agreed. I’ll at least edit the post to point to this comment.