I’m not sure why I’m getting downmodded into oblivion here. I’ll go out on a limb and assume that I was being incomprehensible, even though I’ll be digging myself in deeper if that wasn’t the reason...
In classical game theory (subgame-perfect equilibrium), if you eat my chocolate, it is not rational for me to tweak your nose in retaliation at cost to myself. But if I can first commit myself to tweaking your nose if you eat my chocolate, it is no longer rational for you to eat it. But, if you can even earlier commit to definitely eating my chocolate even if I commit to then tweaking your nose, it is (still in classical game theory) no longer rational for me to commit to tweaking your nose! The early committer gets the good stuff.
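To make the order-of-commitment point concrete, here is a minimal backward-induction sketch. The payoff numbers and all of the names (`PAYOFFS`, `owner_reply`, `eater_move`, `outcome`, `owner_should_commit`) are my own assumptions for illustration; only the ordering of the payoffs matters, and nothing here is meant as anyone's actual decision theory.

```python
# Illustrative payoffs (assumed, not from the discussion), as (eater, owner).
PAYOFFS = {
    ("abstain", None): (0, 0),    # chocolate stays with its owner
    ("eat", "ignore"): (1, -1),   # eater enjoys it, owner loses it
    ("eat", "tweak"): (-1, -2),   # tweak hurts the eater and costs the owner
}


def owner_reply(owner_committed):
    """What the owner does after the chocolate has been eaten."""
    if owner_committed:
        return "tweak"  # a binding commitment removes the choice
    # Without a commitment, the owner simply picks the better payoff.
    return max(("ignore", "tweak"), key=lambda r: PAYOFFS[("eat", r)][1])


def eater_move(owner_committed, eater_committed=False):
    """The eater's move, anticipating the owner's reply."""
    if eater_committed:
        return "eat"  # an earlier commitment to eat regardless
    reply = owner_reply(owner_committed)
    eat_value = PAYOFFS[("eat", reply)][0]
    abstain_value = PAYOFFS[("abstain", None)][0]
    return "eat" if eat_value > abstain_value else "abstain"


def outcome(owner_committed, eater_committed=False):
    """The path actually played, as a key into PAYOFFS."""
    if eater_move(owner_committed, eater_committed) == "abstain":
        return ("abstain", None)
    return ("eat", owner_reply(owner_committed))


def owner_should_commit(eater_committed):
    """Does committing to tweak leave the owner strictly better off?"""
    with_commitment = PAYOFFS[outcome(True, eater_committed)][1]
    without_commitment = PAYOFFS[outcome(False, eater_committed)][1]
    return with_commitment > without_commitment


print(outcome(False))              # ('eat', 'ignore')
print(outcome(True))               # ('abstain', None)
print(owner_should_commit(False))  # True
print(owner_should_commit(True))   # False
```

Under these assumed numbers, the owner's commitment deters an uncommitted eater, but stops being worth making once the eater has already committed to eat.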
Eliezer’s arguments have convinced me that a better decision theory would work like Vladimir says, acting as if you had made a commitment in all situations where you would like to make a commitment. But as far as I can see, both the nose-tweaker and the chocolate-eater can do that—speaking in intuitive human terms, it comes down to who is more stubborn. So what does happen? Is there a symmetry breaker? Can it happen that you commit to eating my chocolate, I commit to tweaking your nose, and we end up in the worst possible world for both of us? (Well, I’m pretty confident that that’s not what Eliezer’s theory (not shown) would do.)
Borrowing from classical game theory, perhaps we say that one of the two commitment scenarios happens, but we can’t say which (1. you eat my chocolate and I don’t tweak your nose; 2. you don’t eat my chocolate, which is a good thing because I would tweak your nose if you did). In the simple commitment game we’re considering here, this amounts to considering all Nash equilibria instead of only subgame perfect equilibria (Nash = “no player can do better by changing their strategy”—but I’m allowed to counterfactually tweak your nose at cost to myself if we don’t actually reach that part of the game tree at equilibrium). But of course, if you accept Eliezer’s arguments, Nash equilibrium is wrong in general, and in any case, it’s not obvious to me if “either of the two scenarios can happen” is the right solution to this game.
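The Nash versus subgame-perfect distinction can be checked by brute force on a small normal form of the game. This is only a sketch: the payoffs and the strategy labels (`tweak_if_eaten`, `ignore`) are assumptions I made up for the example, and only pure strategies are considered.

```python
from itertools import product

EATER = ("eat", "abstain")
OWNER = ("tweak_if_eaten", "ignore")  # the owner's plan for the post-eating subgame


def payoff(eater, owner):
    """Assumed payoffs as (eater, owner); only their ordering matters."""
    if eater == "abstain":
        return (0, 0)  # the owner's plan is never carried out
    return (-1, -2) if owner == "tweak_if_eaten" else (1, -1)


def is_nash(eater, owner):
    """No player can do strictly better by unilaterally switching strategy."""
    eater_pay, owner_pay = payoff(eater, owner)
    eater_ok = all(payoff(e, owner)[0] <= eater_pay for e in EATER)
    owner_ok = all(payoff(eater, o)[1] <= owner_pay for o in OWNER)
    return eater_ok and owner_ok


def is_subgame_perfect(eater, owner):
    """Nash, and the owner's plan is also optimal once eating has happened."""
    best_reply_value = max(payoff("eat", o)[1] for o in OWNER)
    return is_nash(eater, owner) and payoff("eat", owner)[1] == best_reply_value


def nash_equilibria():
    return [(e, o) for e, o in product(EATER, OWNER) if is_nash(e, o)]


def subgame_perfect_equilibria():
    return [(e, o) for e, o in product(EATER, OWNER) if is_subgame_perfect(e, o)]


print(nash_equilibria())
# [('eat', 'ignore'), ('abstain', 'tweak_if_eaten')]
print(subgame_perfect_equilibria())
# [('eat', 'ignore')]
```

The two Nash equilibria correspond to scenarios 1 and 2 above; the second survives only because the costly tweak is never actually reached, which is exactly what subgame perfection rules out.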
To make the implicit motivation behind these two comments explicit: I’m worried that there’s a danger of writing “the rightful owner will keep their chocolate” on the bottom line, noticing that a proper decision theory would allow them to retaliate, and saying “done!” without even considering whether the same logic allows the nefarious villain to spitefully commit to eating the chocolate anyhow. If the theory says that either of the two commitment outcomes may happen, ok, but I think it deserves mention. And if the theory says something else, I want to know that too. :-)
You can’t argue with a rock, so you can’t stop a rock-solid commitment, even with your own rock-solid commitment. But you can solve the game given the commitments, with the outcome for each side. If this outcome is inferior to other possible commitments, then those other commitments should be used instead.
So, if the hero expects that his commitment to die will still result in the villain making him die, this commitment is not a good idea and shouldn’t be made (for example, maybe the villain just wants to play the game). The tricky part is that if the hero expected his commitment to stop the villain, he still needs to dutifully die once the villain surprises him, to the extent that this is necessary for communicating the commitment to the villain before the villain decides, since it’s precisely this communicated model of behavior that was supposed to stop him.
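The "solve the game given the commitments, then compare against other possible commitments" step can be sketched with the chocolate example from the parent comment. Everything here is a hypothetical illustration: the `OUTCOME` table, its payoff numbers, and the two helper functions are assumptions, not part of any worked-out theory.

```python
# Assumed payoffs as (eater, owner), indexed by
# (eater_commits_to_eat, owner_commits_to_tweak).
OUTCOME = {
    (True, True): (-1, -2),   # chocolate eaten, nose tweaked: worst for both
    (True, False): (1, -1),   # chocolate eaten, no retaliation
    (False, True): (0, 0),    # eater is deterred
    (False, False): (1, -1),  # an uncommitted owner won't retaliate, so eater eats
}


def owner_best_commitment(eater_commits):
    """Should the owner commit to tweaking, given the eater's commitment?"""
    return max((True, False), key=lambda c: OUTCOME[(eater_commits, c)][1])


def eater_best_commitment(owner_commits):
    """Should the eater commit to eating, given the owner's commitment?

    On a tie, max() returns the first option listed, i.e. not committing.
    """
    return max((False, True), key=lambda c: OUTCOME[(c, owner_commits)][0])


print(owner_best_commitment(eater_commits=True))   # False
print(owner_best_commitment(eater_commits=False))  # True
print(eater_best_commitment(owner_commits=True))   # False
print(eater_best_commitment(owner_commits=False))  # False
```

With these numbers, each side drops its commitment when it expects the other's commitment to hold anyway, which is the "this commitment shouldn't be made" case; the table does not say which side's expectation is the correct one.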