Option 2 is what I call “blindly importing related historical data as if it was a true description of your situation”. Clearly any model that says that the joint probability for your situation is identically equal to the empirical frequencies in any random data set is wrong.
Agreed that this is a bad idea. I think where we disagree is that I don’t see EDT as discouraging this. It doesn’t even throw a type error when you give it blindly imported related historical data! CDT encourages you to actually think about causality before making any decisions.
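As a toy illustration of that point, here is a minimal sketch in which a hidden common cause drives both an action and an outcome. Everything in it is an assumption made up for illustration: the variable names (`hidden`, `action`, `outcome`), the probabilities, and the choice to have the action exert no causal influence on the outcome at all. Conditioning on the action, as a naive reading of historical frequencies would, makes the action look important; cutting the edge into the action, as a causal model would, shows that it is irrelevant.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_HIDDEN = 0.5                             # P(hidden = 1)
P_ACTION_GIVEN_HIDDEN = {1: 0.8, 0: 0.2}   # P(action = 1 | hidden)
P_OUTCOME_GIVEN_HIDDEN = {1: 0.9, 0: 0.1}  # P(outcome = 1 | hidden); action has no effect


def bern(p, x):
    """Probability that a Bernoulli(p) variable takes the value x."""
    return p if x == 1 else 1.0 - p


def joint(hidden, action, outcome):
    """Joint probability of one full assignment under the toy model."""
    return (bern(P_HIDDEN, hidden)
            * bern(P_ACTION_GIVEN_HIDDEN[hidden], action)
            * bern(P_OUTCOME_GIVEN_HIDDEN[hidden], outcome))


def p_outcome_given_action(action):
    """P(outcome = 1 | action): plain conditioning on the joint."""
    num = sum(joint(h, action, 1) for h in (0, 1))
    den = sum(joint(h, action, o) for h, o in product((0, 1), repeat=2))
    return num / den


def p_outcome_do_action(action):
    """P(outcome = 1 | do(action)): the hidden -> action edge is cut,
    so the hidden variable keeps its prior whatever the action is."""
    return sum(bern(P_HIDDEN, h) * bern(P_OUTCOME_GIVEN_HIDDEN[h], 1)
               for h in (0, 1))


for a in (0, 1):
    print(a, p_outcome_given_action(a), p_outcome_do_action(a))
```

Under these assumed numbers the conditional probabilities differ sharply between the two actions, while the interventional ones are identical. Nothing in the conditioning computation signals that the first pair of numbers is the wrong thing to act on.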
It’s about having a model that correctly generalises from observed data to predictions.
Note that decision theory does actually serve a slightly different role from a general prediction module, because it should be built specifically for counterfactual reasoning. The five-and-ten argument seems to be an example of this: if, while observing another agent, you see them choose $5 over $10, it could be reasonable to update towards them preferring $5 to $10. But when you consider the hypothetical situation where you yourself choose $5 instead of $10, it does not make sense to update towards yourself preferring $5 to $10, or to draw whatever conclusion you like by the principle of explosion.
that admittedly happens to be an elegant and convenient one.
Given that you can emulate one system using the other, I think that elegance and convenience are the criteria we should use to choose between them. Note that emulating a joint probability without causal knowledge using a causal network is trivial (you just use undirected edges for any correlations), but emulating a causal network using a joint probability is difficult.