Anti-Parfit’s Hitchhiker
While thinking about Parfit’s Hitchhiker, an alternative example occurred to me:
You’re lost in the desert, and this time Aul Peckman drives up and tells you, “I will give you a ride back to town iff you would have stiffed my nemesis, Paul Eckman.” After reading about Parfit’s Hitchhiker, you had pre-committed to paying Paul Eckman if this ever happened to you, or had chosen a decision theory that would cause you to do so. So you try telling Aul Peckman that you would stiff his nemesis in this situation, but he knows you’re lying and drives off. If only you weren’t so timelessly rational!
Obviously, one can argue that you’re more likely to encounter agents who want to be paid than agents who want you to stiff someone else, so if you live in a world where that is true, you still get positive EV from running TDT/UDT. But is this an example of regretting TDT rationality?
The correct TDT behaviour in a world in which Aul Peckman exists depends upon the relative probabilities of having to rely on Aul or on Paul to save you. If you believe that you are more likely to encounter Aul, then you should commit to stiffing Paul, and Aul will recognize that and drive you back.
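To make this concrete, here is the expected-value comparison in symbols; the variables $p$, $V$, and $c$ are illustrative notation of mine, not from the original discussion. Let $p$ be the prior probability that the driver you meet is Aul rather than Paul, $V$ the value of being rescued, and $c$ the cost of paying. A single commitment bit covers both cases, with expected values

$$\mathrm{EV}(\text{pay}) = (1-p)(V-c), \qquad \mathrm{EV}(\text{stiff}) = p\,V,$$

so committing to pay wins exactly when $p < (V-c)/(2V-c)$, which for $c \ll V$ is approximately $p < 1/2$: whichever driver is more likely determines the winning commitment.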
If you encounter the wrong one, or if you didn’t know that Aul existed, then it’s just an uninteresting case of regret due to receiving new information about the world.
The decision depends on the a priori probabilities of the situations described in the thought experiments (two situations: PH and Anti-PH). The constraints force (winning in PH) xor (winning in Anti-PH), so there are two options to choose from: (win in PH, lose in Anti-PH) and (lose in PH, win in Anti-PH). The value of either option is a weighted sum of its values in PH and Anti-PH, with weights given by their (relative) a priori probabilities. Since the payoffs in PH and Anti-PH are the same, the situation with the higher probability determines the winning strategy overall.
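A minimal sketch of that weighted sum in Python, under assumed illustrative payoffs; the numbers, the `payoff` table, and `best_policy` are mine, not from the comment:

```python
# Minimal sketch: pick the commitment bit that maximizes the
# prior-weighted sum of payoffs across PH and Anti-PH.
# All numbers are illustrative; only the structure matters.

V = 1_000_000  # value of being driven back to town (surviving)
C = 100        # cost of paying the driver

# Payoff of each policy in each scenario.
# In PH, Paul saves you iff you would pay him; in Anti-PH,
# Aul saves you iff you would stiff Paul.
payoff = {
    ("pay",   "PH"):      V - C,  # rescued, then you pay
    ("pay",   "Anti-PH"): 0,      # Aul drives off
    ("stiff", "PH"):      0,      # Paul drives off
    ("stiff", "Anti-PH"): V,      # rescued for free
}

def best_policy(p_anti_ph: float) -> str:
    """Return the policy with the higher prior-weighted expected value."""
    p_ph = 1.0 - p_anti_ph
    ev = {
        policy: p_ph * payoff[(policy, "PH")]
              + p_anti_ph * payoff[(policy, "Anti-PH")]
        for policy in ("pay", "stiff")
    }
    return max(ev, key=ev.get)

print(best_policy(0.1))  # 'pay'   -- PH dominates the prior
print(best_policy(0.9))  # 'stiff' -- Anti-PH dominates the prior
```

Because the two rescue payoffs are nearly symmetric, the crossover sits near $p = 1/2$, matching the claim that the more probable situation determines the winning strategy.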
More generally, for any situation described in a thought experiment, there is another thought experiment with negated payoffs. It doesn’t matter, because by convention, when a thought experiment is described, it’s implicitly given more a priori probability than all other related thought experiments combined.
So when Anti-PH is the thought experiment being given, it implicitly holds more weight than PH, and the correct decision is to lose in PH. But when PH is given, the probabilities are the other way around, and the correct decision is to win in PH. The paradox is explained by these thought experiments not just being different possible situations, but implying different a priori distributions over all situations, including each other. They don’t exist in the same world, even though each one’s world contains the other situation, just with lower a priori probability than in its own world.
Sure, but this example is so trivial that it makes me think you haven’t fully understood the point of the Parfit’s Hitchhiker scenario. The point is that a CDT agent regrets its choices even when the entire setup is known to the agent beforehand, and even though the choices themselves are the only thing that determines the outcome, not the CDT agent’s internal decision-making process. A TDT agent will never regret its choices given those constraints. So if you find those constraints aesthetically pleasing, you will find TDT aesthetically pleasing.