Why can’t we just imagine that you are an agent that doesn’t care about counterfactual selves?
Caring about counterfactual selves is part of UDT, though. If you simply assume that it doesn’t hold, and ask proponents of UDT to argue under that assumption, I’m not sure there’s a good answer.
Interesting. Do you take caring about counterfactual selves as foundational—in the sense that there is no why, you either do or do not?
No, not like that. I think there is an argument for caring about counterfactual selves. But it cannot be carried out from the assumption that the agent doesn’t care about counterfactual selves. You’re just asking me to do something impossible.
I guess my argument starts by imagining that agents can either care about counterfactual selves or not. Agents that don’t are a bit controversial, so let’s imagine such an agent and see if we run into any issues. So imagine a consistent agent that doesn’t care about counterfactual selves except insofar as they “could be it” from its current epistemic position. I can’t see any issues with this—it seems consistent. And my challenge is for you to explain why this isn’t a valid set of values to have.
Let’s imagine a kind of symmetric counterfactual mugging. In case of heads, Omega says: “The coin came up heads; now you can either give me $100 or refuse. After that, I’ll give you $10000 if you would’ve given me $100 in case of tails.” In case of tails, Omega says the same thing, but with heads and tails reversed. In this situation, an agent who doesn’t care about counterfactual selves always gets 0 regardless of the coin, while an agent who does care always gets $9900 regardless of the coin.
I can’t think of any situation where the opposite happens (the non-caring agent gets more with certainty). To me that suggests the caring agent is more rational.
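For concreteness, here is a minimal sketch in Python of the payoff table for this symmetric mugging. The `pays` flag and the `payoff` function are just illustrative names, and a consistent policy is assumed: the agent either hands over the $100 in both branches or in neither, so a single flag stands in for its whole policy.

```python
# Payoff table for the symmetric counterfactual mugging described above.
# A consistent policy pays in both branches or in neither, so a single
# boolean 'pays' stands in for the agent's whole policy.

def payoff(pays: bool) -> int:
    reward = 10_000 if pays else 0  # Omega pays out iff the counterfactual self would have paid
    cost = 100 if pays else 0       # the $100 handed over in the observed branch
    return reward - cost

for flip in ("heads", "tails"):
    print(flip, "caring agent:", payoff(True), "non-caring agent:", payoff(False))
# Either way the coin lands: $9900 for the caring agent, $0 for the non-caring one.
```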
Yeah, I actually stumbled upon this argument myself this morning. Has anyone written this up anywhere beyond this comment? It seems like the most persuasive argument for paying, and it suggests that never caring is not a viable position.
I was thinking today about whether there are any intermediate positions, but I don’t think they are viable. Only caring about counterfactuals when you have a prisoner’s-dilemma-like situation seems like an unprincipled fudge.
Yeah. I don’t remember seeing this argument before; it just came to my mind today.
Do you think you’ll write a post on it? I was thinking of writing one myself, but if you were planning to, that would probably be even better, since it would get more attention.
No, wasn’t planning. Go ahead and write the post, and maybe link to my comment as independent discovery.
Of course
“In this situation, an agent who doesn’t care about counterfactual selves always gets 0 regardless of the coin”
Since the agent is very correlated with its counterfactual copy, it seems that superrationality (or even just EDT) would make the agent pay the $100 and get the $10000.
Actually, the counterfactual agent makes a different observation (heads instead of tails), so their actions aren’t necessarily linked.
I just thought of another argument. Imagine that before being faced with counterfactual mugging, the agent can make a side bet on Omega’s coin. Say the agent who doesn’t care about counterfactual selves bets X dollars on heads, so its income is X in case of heads and -X in case of tails. Then the agent who cares about counterfactual selves can bet X - 5050 on heads (or, if that’s negative, bet 5050 - X on tails). Since this agent agrees to pay Omega, its income is 10000 + (X - 5050) = 4950 + X in case of heads, and (5050 - X) - 100 = 4950 - X in case of tails. So in both cases the caring agent gets $4950 more than the non-caring agent. And the opposite is impossible: no matter how the two agents bet, the caring agent gets more in at least one of the two cases.
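Here is a quick Python sketch of that arithmetic, assuming the standard counterfactual-mugging payoffs the numbers above imply ($10000 on heads for an agent whose policy pays, -$100 on tails). The `incomes` helper and the sample stakes are hypothetical, just to check the $4950 gap.

```python
# Verify that with the stake X - 5050 on heads, the caring agent ends up
# exactly $4950 ahead of the non-caring agent on both heads and tails,
# whatever stake X the non-caring agent puts on heads (negative X = a bet on tails).

def incomes(x):
    non_caring = {"heads": x, "tails": -x}        # refuses Omega, so only the side bet pays
    b = x - 5050                                  # the caring agent's stake on heads
    caring = {"heads": 10_000 + b, "tails": -100 - b}
    return non_caring, caring

for x in (-3000, 0, 2000, 7000):
    nc, c = incomes(x)
    assert all(c[f] - nc[f] == 4950 for f in ("heads", "tails"))
print("caring agent ends up $4950 ahead in every case")
```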
“Imagine that before being faced with counterfactual mugging, the agent can make a side bet on Omega’s coin”—I don’t know if that works. Part of counterfactual mugging is that you aren’t told before the problem that you might be mugged; otherwise you could just pre-commit.