I get that this could work for a computer, because a computer can be bound by an overall decision theory without attempting to think about whether that decision theory still makes sense in the current situation.
I don’t mind predictors in, e.g., Newcomb’s problem. Effectively, there is a backward causal arrow, because whatever you choose causes the predictor to have already acted differently. Unusual, but reasonable.
However, in this case, yes, your choice affects the predictor’s earlier decision—but since the coin never came down heads, who cares any more how the predictor would have acted? Why care about being the kind of person who will pay the counterfactual mugger, if there will never again be any opportunity for it to pay off?
Yes, that is the problem in question!
If you want the payoff, you have to be the kind of person who will pay the counterfactual mugger, even once you no longer can benefit from doing so. Is that a reasonable feature for a decision theory to have? It’s not clear that it is; it seems strange to pay out, even though the expected value of becoming that kind of person is clearly positive before you see the coin. That’s what the counterfactual mugging is about.
If you’re asking “why care” rhetorically, and you believe the answer is “you shouldn’t be that kind of person”, then your decision theory prefers lower expected values, which is also pathological. How do you resolve that tension? This is, once again, literally the entire problem.
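To make that tension concrete, here is a minimal sketch with made-up numbers. The stakes (pay 100 on tails, receive 10,000 on heads), the fair coin, and the perfectly accurate predictor are all assumptions for illustration; nothing in this thread fixes them. It only shows that the same policy looks good before the flip and bad after tails.

```python
from fractions import Fraction

# Hypothetical stakes: the thread does not specify any numbers.
COST_ON_TAILS = 100       # what the mugger asks for after tails
REWARD_ON_HEADS = 10_000  # paid on heads, but only to agents predicted to pay on tails
P_HEADS = Fraction(1, 2)  # fair coin (assumption)


def ex_ante_value(pays_on_tails: bool) -> Fraction:
    """Expected value of a policy, evaluated before the coin is flipped."""
    heads_payoff = REWARD_ON_HEADS if pays_on_tails else 0
    tails_payoff = -COST_ON_TAILS if pays_on_tails else 0
    return P_HEADS * heads_payoff + (1 - P_HEADS) * tails_payoff


def ex_post_value_after_tails(pays_on_tails: bool) -> int:
    """Payoff once tails is already known: the heads branch can no longer pay off."""
    return -COST_ON_TAILS if pays_on_tails else 0


if __name__ == "__main__":
    for policy in (True, False):
        print(policy, ex_ante_value(policy), ex_post_value_after_tails(policy))
```

Under these assumed numbers, the paying policy is worth more before the flip, while refusing is worth more once tails has been seen. The sketch does not say which evaluation point a decision theory should use; that is the open question.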
Well, as previously stated, my view is that the scenario as stated (single-shot with no precommitment) is not the most helpful hypothetical for designing a decision theory. An iterated version would actually be more relevant, since we want to design an AI that can make more than one decision. And in the iterated version, the tension is largely resolved, because there is a clear motivation to stick with the decision: we still hope for the next coin to come down heads.
Are you actually trying to understand? At some point you’ll predictably approach death, and predictably assign a vanishing probability to another offer or coin-flip coming after a certain point. Your present self should know this. Omega knows it by assumption.
I’m pretty sure that decision theories are not designed on that basis. We don’t want an AI to start making different decisions based on the probability of an upcoming decommissioning. We don’t want it to become nihilistic and stop making decisions because it predicted the heat death of the universe and decided that all paths have zero value. If death is actually tied to the decision in some way, then sure, take that into account, but otherwise, I don’t think a decision theory should have “death is inevitably coming for us all” as a factor.
You are wrong. In fact, this is a totally standard thing to consider, and “avoid back-chaining defection in games of fixed length” is a known problem, with various known strategies.
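As a rough illustration of what back-chaining looks like in the iterated mugging, here is a toy backward-induction sketch. Everything in it is an assumption made for the example: a known number of rounds, stakes of 100 and 10,000, a fair coin, a predictor that rewards heads in a round only if the agent would pay on tails in that same round, and a rule that refusing once ends all future offers. It models the failure mode only, not any of the known strategies for avoiding it.

```python
def ex_post_agent_plan(num_rounds, cost=100, reward=10_000, p_heads=0.5):
    """Backward induction for an agent that re-decides after every tails.

    Returns (pays, value): pays[i] says whether the agent would pay on tails
    in round i + 1, and value is its expected total payoff from round 1.
    """
    # value_from[t] = expected payoff of rounds t..num_rounds; nothing after the last.
    value_from = [0.0] * (num_rounds + 2)
    pays = [False] * (num_rounds + 1)  # index 0 unused

    for t in range(num_rounds, 0, -1):
        future = value_from[t + 1]
        # After tails, paying is only worth it if the future it protects exceeds the cost.
        pays[t] = future - cost > 0
        # The predictor rewards heads only if the agent would have paid on tails.
        heads_branch = (reward if pays[t] else 0) + future
        # Assumption: refusing forfeits every later round, so it is worth 0.
        tails_branch = max(future - cost, 0)
        value_from[t] = p_heads * heads_branch + (1 - p_heads) * tails_branch

    return pays[1:], value_from[1]


def committed_agent_value(num_rounds, cost=100, reward=10_000, p_heads=0.5):
    """Expected total payoff of an agent that always pays, in every round."""
    return num_rounds * (p_heads * reward - (1 - p_heads) * cost)


if __name__ == "__main__":
    print(ex_post_agent_plan(10))
    print(committed_agent_value(10))
```

In this toy model the last round is worth nothing to the re-deciding agent, so it refuses there; that makes the round before it worth nothing as well, and the refusal chains back to round 1. Hoping for the next coin to come down heads only helps if there is always a next coin.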