Thanks for your answer! This “gain” approach seems quite similar to what Wedgwood (2013) has proposed as “Benchmark Theory”, which behaves like CDT in cases with causally dominant actions, but more like EDT in cases without them. My hunch would be that one might be able to construct a series of thought-experiments in which such a theory violates transitivity of preference, as demonstrated by Ahmed (2012).
I don’t understand how you arrive at a gain of 0 for not smoking as a smoke-lover in my example. I would think the gain for not smoking is higher:
Gain(a2) = E[U|a2] − E[U|a2, do(a1)]
= P(S1|a2)⋅U(S1∧a2) + P(S2|a2)⋅U(S2∧a2) − P(S1|a2)⋅U(S1∧a1) − P(S2|a2)⋅U(S2∧a1)
= P(S1|a2)⋅(−10) + P(S2|a2)⋅90
= P(S1|a2)⋅(−100) + 90.
So as long as P(S1|a2)<0.8, the gain of not smoking is actually higher than that of smoking. For example, given prior probabilities of 0.5 for either state, the equilibrium probability of being a smoke-lover given not smoking will be 0.5 at most (in the case in which none of the smoke-lovers smoke).
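To make the threshold concrete, here is a minimal Python sketch of the calculation. It uses only the utility differences from the derivation (−10 if you are a smoke-lover, +90 otherwise). The function name and the gain of 10 for smoking are assumptions for illustration; 10 is simply the value that the 0.8 threshold implies.

```python
def gain_not_smoking(p_lover_given_abstain: float) -> float:
    """Gain(a2) = P(S1|a2) * (-10) + P(S2|a2) * 90, with P(S2|a2) = 1 - P(S1|a2)."""
    p = p_lover_given_abstain
    return p * (-10) + (1 - p) * 90


# Assumption: the gain of smoking is 10, the value implied by the 0.8 threshold.
ASSUMED_GAIN_SMOKING = 10.0

# Compare the two gains for a few values of P(S1|a2).
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    gain = gain_not_smoking(p)
    better = "not smoking" if gain > ASSUMED_GAIN_SMOKING else "smoking"
    print(f"P(S1|a2) = {p:.2f}: Gain(a2) = {gain:.1f}, higher gain: {better}")
```

At P(S1|a2) = 0.5, the largest value possible under equal priors, the gain of not smoking is 40, well above the assumed 10 for smoking.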
Ah, you’re right. So gain doesn’t achieve as much as I thought it did. Thanks for the references, though. I think the idea is also similar in spirit to a proposal of Jeffrey’s in his book The Logic of Decision; he presents an evidential theory, but is as troubled by cooperating in the prisoner’s dilemma and one-boxing in Newcomb’s problem as other decision theorists. So, he suggests that a rational agent should prefer actions such that, having updated on probably taking that action rather than another, the agent still prefers that action. (I don’t remember what he proposed for cases when no such action is available.) This has a similar structure of first updating on a potential action and then checking how alternatives look from that position.