Thanks Dagon:

Any mechanism to revoke or change a commitment is directly giving up value IN THE COMMON FORMULATION of the problem
Can you say more about what you mean by “giving up value”?
Our contention is that the ex-ante open-minded agent is not giving up (expected) value, in the relevant sense, when they “revoke their commitment” upon becoming aware of certain possible counterpart types. That is, they are choosing the course of action that would have been optimal according to the priors that they believe they should have set at the outset of the decision problem, had they been aware of everything they are aware of now. This captures an attractive form of deference: at the time such an agent goes updateless / chooses its commitments, it recognizes its lack of full awareness and defers to a version of itself that is aware of more considerations relevant to the decision problem.
As we say, the agent does make themselves exploitable in this way (and so “gives up value” to exploiters, with some probability). But they are still optimizing the right notion of expected value, in our opinion.
So I’d be interested to know what, more specifically, your disagreement with this perspective is. E.g., we briefly discuss a couple of alternatives (close-mindedness and awareness growth-unexploitable open-mindedness). If you think one of those is preferable I’d be keen to know why!
This model doesn’t seem to really specify the full ruleset that it’s optimizing for
Sorry that this isn’t clear from the post. I’m not sure which parts were unclear, but in brief: It’s a sequential game of Chicken in which the “predictor” moves first; the predictor can fully simulate the “agent’s” policy; there are two possible types of predictor (Normal, who best-responds to their prediction, and Crazy, who Dares no matter what); and the agent starts off unaware of the possibility of Crazy predictors, and only becomes aware of the possibility of Crazy types when they see the predictor Dare.
If a lack of clarity here is still causing confusion, maybe I can try to clarify further.
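In case concreteness helps, here is a minimal sketch of that setup in Python. The payoff numbers are illustrative stand-ins rather than anything from the post, the awareness dynamics themselves aren’t modeled, and the two policies at the end are just the hypothetical extremes under discussion:

```python
# Minimal sketch of the game described above. Payoff numbers are illustrative
# stand-ins, not the ones from the post. Keys are (agent_move, predictor_move).
PAYOFF = {
    ("D", "D"): -10,  # both Dare: crash
    ("D", "S"): 1,    # agent Dares, predictor Swerves
    ("S", "D"): -1,   # agent Swerves, predictor Dares
    ("S", "S"): 0,
}

def normal_predictor(agent_policy):
    """Normal type: simulates the agent's policy and best-responds to it."""
    # If the agent would still Dare after seeing a Dare, Daring means a crash,
    # so the Normal predictor Swerves; otherwise it Dares.
    return "S" if agent_policy("D") == "D" else "D"

def crazy_predictor(agent_policy):
    """Crazy type: Dares no matter what."""
    return "D"

def play(agent_policy, predictor):
    predictor_move = predictor(agent_policy)   # predictor moves first
    agent_move = agent_policy(predictor_move)  # agent sees the move, then acts
    return PAYOFF[(agent_move, predictor_move)]

committed = lambda observed: "D"                              # Dare no matter what
revisable = lambda observed: "S" if observed == "D" else "D"  # back down on seeing Dare

for name, policy in [("committed", committed), ("revisable", revisable)]:
    print(name,
          "vs Normal:", play(policy, normal_predictor),
          "vs Crazy:", play(policy, crazy_predictor))
```

Against a Normal predictor the unconditional commitment does well and the back-down policy gets exploited; against a Crazy predictor the ranking reverses. That trade-off is what is at issue here.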
I also suspect you’re conflating updates of knowledge with strength and trustworthiness of commitment. It’s absolutely possible (and likely, in some formulations about timing and consistency) that a player can rationally make a commitment, and then later regret it, WITHOUT preferring at the time of commitment not to commit.
I’m not sure I understand your first sentence. I agree with the second sentence.
Any mechanism to revoke or change a commitment is directly giving up value IN THE COMMON FORMULATION of the problem
Can you say more about what you mean by “giving up value”?
Sure. In the common formulation https://en.wikipedia.org/wiki/Chicken_(game) , when Alice believes (with more than 1000:1 probability) that she is first mover against a rational opponent, she commits to Dare. The ability to revoke this commitment hurts her if her opponent commits in the meantime, as she is now better off swerving, but worse off than if her commitment had been (known to be) stronger.
For this to be wrong, the opponent must be (with some probability) irrational—that’s a HUGE change in the setup. Whether she wants to lose (by just always playing Swerve, regardless of opponent), or wait for more information about the opponent, is based on her probability assessment of whether the opponent is actually irrational. If she assigns it 0% (correctly or incorrectly), she should commit or she’s giving up expected value based on her current knowledge. If she assigns it higher than that, it will depend on her model of the distribution of opponents and THEIR commitment timing.
You can’t just say “Alice has wrong probability distributions, but she’s about to learn otherwise, so she should use that future information”. You COULD say “Alice knows her model is imperfect, so she should be somewhat conservative”, but really that collapses to a different-but-still-specific probability distribution.
You don’t need to bring updates into it, and certainly don’t need to consider future updates. https://www.lesswrong.com/tag/conservation-of-expected-evidence means you can only expect any future update to match your priors.
For this to be wrong, the opponent must be (with some probability) irrational—that’s a HUGE change in the setup
For one thing, we’re calling such agents “Crazy” in our example, but they need not be irrational. They might have weird preferences such that Dare is a dominant strategy. And as we say in a footnote, we might more realistically imagine more complex bargaining games, with agents who have (rationally) made commitments on the basis of as-yet unconceived of fairness principles, for example. An analogous discussion would apply to them.
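For instance, a counterpart with payoffs like the following (purely illustrative numbers) is doing nothing irrational by always Daring:

```python
# A "Crazy" type that is perfectly rational: with these made-up preferences,
# Dare strictly dominates Swerve, so always-Dare is simply its best response.
CRAZY_PAYOFF = {      # (crazy_move, counterpart_move) -> utility for this type
    ("D", "D"): 2,    # e.g., it positively values never backing down
    ("D", "S"): 3,
    ("S", "D"): 0,
    ("S", "S"): 1,
}
# Dare beats Swerve against either reply (2 > 0 and 3 > 1).
```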
But in any case, it seems like the theory should handle the possibility of irrational agents, too.
You can’t just say “Alice has wrong probability distributions, but she’s about to learn otherwise, so she should use that future information”. You COULD say “Alice knows her model is imperfect, so she should be somewhat conservative”, but really that collapses to a different-but-still-specific probability distribution.
Here’s what I think you are saying: In addition to giving prior mass to the hypothesis that her counterpart is Normal, Alice can give prior mass to a catchall that says “the specific hypotheses I’ve thought of are all wrong”. Depending on the utilities she assigns to different policies given that the catchall is true, she might not commit to Dare after all.
I agree that Alice can and should include a catchall in her reasoning, and that this could reduce the risk of bad commitments. But that doesn’t quite address the problem we are interested in here. There is still a question of what Alice should do once she becomes aware of the specific hypothesis that the predictor is Crazy. She could continue to evaluate her commitments from the perspective of her less-aware self, or she could do the ex-ante open-minded thing and evaluate commitments from the priors she should have had, had she been aware of the things she’s aware of now. These two approaches come apart in some cases, and we think that the latter is better.
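To make the contrast concrete, here is a toy version of the calculation, in the same spirit as the sketch in my earlier comment. The payoffs and the revised prior of 0.3 are arbitrary stand-ins, not numbers we argue for anywhere:

```python
def ev(payoff_vs_normal, payoff_vs_crazy, p_crazy):
    """Expected agent payoff for a policy, given P(predictor is Crazy)."""
    return (1 - p_crazy) * payoff_vs_normal + p_crazy * payoff_vs_crazy

# From the earlier sketch: committing to Dare gets +1 vs Normal and -10 vs Crazy;
# backing down on seeing Dare gets -1 against either type.
COMMIT, BACK_DOWN = (1, -10), (-1, -1)

# Less-aware prior: the Crazy hypothesis isn't in the hypothesis space at all
# (modeled crudely here as probability 0). Commitment looks clearly best.
print(ev(*COMMIT, 0.0), ev(*BACK_DOWN, 0.0))  # 1.0 vs -1.0

# Prior the agent thinks it *should* have had once aware of the Crazy hypothesis
# (0.3 is an arbitrary stand-in): the ranking of the two policies flips.
print(ev(*COMMIT, 0.3), ev(*BACK_DOWN, 0.3))  # roughly -2.3 vs -1.0
```

Evaluating commitments “from the perspective of her less-aware self” means scoring them as in the first comparison; the ex-ante open-minded recommendation is to score them as in the second. That is the sense in which the two approaches come apart.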
I don’t see why EA-OMU agents should violate conservation of expected evidence (well, the version of the principle that is defined for the dynamic awareness setting).
I think if you fully specify the model (including the reasons for commitment rather than just delaying the decision in the first place), you’ll find that the reason for committing is NOT about updates, but about adversarial game theory. Specifically, include in your model that if facing a NORMAL opponent, failure to commit turns your (D, S) outcome (+1) into a (S, D) (-1), because the normal opponent will Dare if you haven’t committed, and then you are best off Swerving. You’ve LOST VALUE because you gave too much weight to the crazy opponent.
How your (distribution of) opponents react to your strategy, which is conditional on your beliefs about THEIR strategy, is the core of game theory. If you have a mix of crazy opponents and rational opponents who you think haven’t committed yet, you don’t need to introduce any update mechanisms; you just need your current probability estimates about the distribution, and commit or don’t based on maximizing your EV.
Where the conservation of expected evidence comes in is that you CANNOT expect to increase your chances of facing a crazy opponent. If you did expect that, you actually have a different prior than you think.
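To put rough numbers on that (the payoffs and the probability grid below are purely illustrative, not from the post):

```python
# p = P(opponent Dares no matter what); payoffs: (D, S) = +1, (S, D) = -1, crash = -10.
def ev_commit(p):
    """Commit to Dare up front."""
    return (1 - p) * 1 + p * (-10)

def ev_no_commit(p):
    """Don't commit: a normal opponent then Dares and you Swerve
    (and you also Swerve against a crazy one)."""
    return -1

# Break-even is p = 2/11 (about 0.18); below that, declining to commit is what
# gives up expected value.
for p in (0.001, 0.1, 0.18, 0.5):
    print(f"p={p}: commit={ev_commit(p):.2f}, no_commit={ev_no_commit(p)}")
```

The whole question is where p actually sits given your current knowledge, not what you might learn later.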
The model is fully specified (again, sorry if this isn’t clear from the post). And in the model we can make perfectly precise the idea of an agent re-assessing their commitments from the perspective of a more-aware prior. Such an agent would disagree that they have lost value by revising their policy. Again, I’m not sure exactly where you are disagreeing with this. (You say something about giving too much weight to a crazy opponent — I’m not sure what “too much” means here.)
Re: conservation of expected evidence, the EA-OMU agent doesn’t expect to increase their chances of facing a crazy opponent. Indeed, they aren’t even aware of the possibility of crazy opponents at the beginning of the game, so I’m not sure what that would mean. (They may be aware that their awareness might grow in the future, but this doesn’t mean they expect their assessments of the expected value of different policies to change.) Maybe you misunderstand what we mean by “unawareness”?
The missing part is the ACTUAL distribution of normal vs crazy opponents (note that “crazy” is perfectly interchangeable with “normal, who was able to commit first”), and the loss that comes from failing to commit against a normal opponent. Or the reasoning that a normal opponent will see it as commitment, even when it’s not truly a commitment if the opponent turns out to be crazy.
Anyway, interesting discussion. I’m not certain I understand where we differ on its applicability, but I think we’ve hashed it out as much as possible. I’ll continue reading and thinking—feel free to respond or rebut, but I’m unlikely to comment further. Thanks!