After drawing a green ball, my probability is 0.9, and I shall base all my personal decisions on that. Why not update the pre-game plan according to that probability? Because the pre-game plan is not my decision. It is an agreement among all participants: a coordination achieved by everyone reasoning objectively.
Eliezer Yudkowsky’s post Outlawing Anthropics: An Updateless Dilemma brought up a paradox involving reflective inconsistency. It was originally constructed in anthropic terms but can also be formulated in a non-anthropic context. Recently, Radford Neal and Ape in the coat discussed it in detail with different insights. Here I present how my approach to anthropic paradoxes—perspective-based reasoning—explains this problem.
The paradox in the non-anthropic context is as follows:
Twenty people take part in an experiment in which one of two urns is randomly chosen, and each of the 20 people randomly takes a ball from the chosen urn, without knowledge of the others’ balls. One of the urns contains 18 green balls and 2 red balls. The other urn contains 2 green balls and 18 red balls.
Each person who holds a green ball decides whether to take part in a bet. If all the holders of green balls decide to take part, the group of 20 people collectively wins $1 for each person who holds a green ball and loses $3 for each person who holds a red ball. The total wins and losses are divided equally among the 20 people at the end. If anyone with a green ball decides not to take the bet, the bet is off. (Some versions of the game punish all players grievously if the decisions among the green-ball holders differ.)
The people can come up with a coordination strategy beforehand, but are each in separate rooms once the game begins. How should they act?
The paradox is presented as follows: the combined payoff if the mostly-green-ball urn is chosen is $12, compared to −$52 if the mostly-red-ball urn is chosen. As the urns are equiprobable, the optimal strategy is clearly not to take the bet. However, if a participant receives a green ball, he shall update the probability of the mostly-green-ball urn from 0.5 to 0.9. The expected payoff of taking the bet would then be 0.9×12+0.1×(−52) = 5.6, which is positive, so taking the bet would be the optimal choice. Furthermore, “Since everyone with a green ball is in the exact situation as I am, we will reach the same decision,” meaning the participants will all change their original strategy and enter the bet to lose money, which is a departure from the pre-game plan.
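The numbers in the paradox’s argument can be checked with a few lines of arithmetic. A minimal sketch using exact fractions (the variable names are mine, purely illustrative):

```python
from fractions import Fraction

# Prior on the mostly-green urn, and the chance that a given
# participant draws a green ball under each urn.
prior = Fraction(1, 2)
p_green_if_green_urn = Fraction(18, 20)
p_green_if_red_urn = Fraction(2, 20)

# Bayesian update on "I drew a green ball".
posterior = (prior * p_green_if_green_urn) / (
    prior * p_green_if_green_urn + prior * p_green_if_red_urn
)

# Combined payoff of taking the bet under each urn.
payoff_green_urn = 18 * 1 - 2 * 3   # $12
payoff_red_urn = 2 * 1 - 18 * 3     # -$52

ev_pregame = prior * payoff_green_urn + (1 - prior) * payoff_red_urn
ev_updated = posterior * payoff_green_urn + (1 - posterior) * payoff_red_urn

print(posterior)   # 9/10
print(ev_pregame)  # -20: the pre-game plan rejects the bet
print(ev_updated)  # 28/5 = 5.6: the updated calculation accepts it
```

Both halves of the apparent contradiction really do come out of the same numbers; the question is which expectation governs which decision.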
Two Different Questions
Let’s not scrutinize the above logic for the moment. Instead, consider the following run-of-the-mill probability questions:
Question 1A: Imagine an objective god’s-eye view, or, if it helps to conceptualize, imagine you are a non-participating outsider. What is the probability that the urn with mostly green balls gets chosen? Would you take a bet that rewards you $12 if it is chosen but punishes you $52 if the urn with mostly red balls gets chosen instead?
Clearly the probability is 0.5 and no sane person would take the bet.
Question 1B: Now suppose you know there exists a green ball assigned to some unspecified participant. Would the probability and decision change?
There is no new information, since it is already known that some participants will hold green balls. So there is no change to either the probability or the decision.
Question 2A: Suppose you are a participant in the game. Prior to drawing balls, what is the probability that the urn has mostly green balls? Would you take a bet that rewards you, say, $1 if the urn is filled mostly with green balls and punishes you, say, $3 if the urn is mostly red?
Again, the probability is clearly 0.5 and there is no way I am taking the bet. (The exact payout amounts are not important, as long as they are in roughly this ratio.)
Question 2B: Now suppose I have drawn a green ball from the urn. Has the probability changed, and what about my decision?
A basic Bayesian update changes my probability to 0.9, and entering the bet is the optimal decision.
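The personal decision in Question 2B is a one-line expectation. A sketch, assuming the illustrative $1/$3 payouts from Question 2A and the 0.9 posterior from the update described in the text:

```python
from fractions import Fraction

# P(mostly-green urn | I drew a green ball), per the Bayesian update.
posterior = Fraction(9, 10)

# Personal bet: win $1 if mostly green, lose $3 if mostly red.
ev_personal_bet = posterior * 1 + (1 - posterior) * (-3)

print(ev_personal_bet)  # 3/5: +$0.60 in expectation, so take the bet
```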
The Erroneous Grafting
The above two problems are as simple as probability questions get, and the decisions are nothing vanilla causal decision theory can’t handle. The supposed reflective inconsistency in the paradox, however, is the result of mixing the two. It takes Question 1A, connects it to Question 2B, then points at the supposed contradiction, while the contradiction is precisely caused by this sleight of hand. Specifically, the paradox swaps a probability from an objective perspective for an ostensibly similar, yet categorically different, probability from the participant’s first-person perspective.
The question is set up to encourage the conflation of the two. From the outset, the payoff is aggregated for all participants. Even though the paradox takes the long way to present it as paying $1 or taking away $3 for each person, “The total wins and losses are divided equally among the 20 people at the end.” So there is no need to consider any participant’s individual circumstances. The overall payoff is the only thing that needs to be optimized by the coordination strategy; participants’ personal interest is conveniently set to coincide with it.
Then there is the problem of numerical difference: for Question 1, there is only one decision; for Question 2, multiple participants make their individual decisions. To negate this problem, there has to be a rule translating the multiple participants’ decisions into the single coordination decision. So the paradox dictates, “If anyone with a green ball decides not to take the bet, the bet is off.” This numerical difference is also why some versions feel compelled to add an additional premise such as “the game punishes all players grievously if the decisions from the green-ball holders are different from one another.” It gives the participants a major incentive to avoid different answers.
One last step enabling the grafting is not part of the question setup, but a step in its analysis: the assumption “Since everyone with the green ball is in the exact situation as I am, we will reach the same decision.” I want to be clear here. I am not suggesting this statement is necessarily, or even likely, wrong. I think the factual correctness of the statement is not pertinent to the paradox’s resolution. However, it does have the effect of blurring the distinction between the two different concepts. Let’s leave it at that for the time being.
Both the setup and the assumption hint toward the same idea, which makes the incoming sleight of hand inconspicuous: that the personal decision and the coordination decision are identical, that their respective strategies, as well as the probabilities, are the same thing.
The Perspective Probability Difference
The question says the participants gather before the experiment to discuss a coordination strategy. It is easy to see that, because all participants are in symmetrical positions, the coordination strategy shall aim to maximize the overall payoff. As Question 1A shows, the best action is not taking the bet. So at least one person with a green ball must say no to it. Because there is no communication afterwards, and the balls are randomly assigned, the strategy would dictate that everybody say no if presented with the choice. So, committed to coordinate, I shall say no to the bet.
I would venture to guess that’s how most people arrived at the pre-game plan. Notably, it is derived not from any particular participant’s viewpoint but from an objective perspective (like the outsider’s). The participants all follow this one optimal strategy. Abiding by the coordination means you do not decide for yourself but carry out the common strategy, as if an objective outsider made the decision and then had the participants carry out the moves. In contrast, if there were no coordination, my personal pre-game plan would need to be derived by maximizing my personal payoff (it may involve finding Nash equilibria, and saying “no” would still be one of the solutions). My guess is most of us did not derive the pre-game plan this way, since the question asked for coordination.
The important thing here is to realize that those are two different decision processes, albeit likely recommending the same move in this instance.[1] The coordination strategy is based on the objective perspective and prescribes the move for all; the personal decision is based on my first-person perspective, where how others act is an input rather than an output of the decision process. The paradox uses the former decision in the first half of its reasoning.
What should the coordination decision be after I get the green ball? Recall that the coordination is derived from the objective viewpoint, and self-identity is only meaningful from the first-person perspective (you can’t use “I” to specify a particular participant while reasoning objectively). Therefore the available information is that some unspecified participant received a green ball, nothing that wasn’t known before. So, as in Question 1B, the probability remains at 0.5 and there is no change to the coordination strategy of everyone saying no to the bet.
There is no denying that from the participant’s first-person perspective the probability changed to 0.9, as in Question 2B. But as I have discussed in perspective disagreement: two semantically identical questions from different perspectives are not the same question, and they can have different answers. The difference here has the same reason as my earlier example: self-specification is only meaningful from the first-person perspective. If anything, the current case is less dramatic than my previous problem, where two parties communicating and fully sharing all information would nonetheless give different answers and both be correct.
Switching perspective is the culprit of the inconsistency. Rather than analyzing the coordination strategy from the objective viewpoint, the paradox derives the probability of 0.9 from a participant’s perspective. Based on this first-person probability, with the assumption that everyone else would act the same way as I do, the paradox describes a collective behaviour for all participants and posits it as the new coordination strategy. In reality, a coordination strategy should prescribe rather than describe.
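The two perspectives can coexist without contradiction, which a simple Monte Carlo sketch may make concrete (all names here are mine): conditioned on my own ball being green, the mostly-green urn is indeed about 90% likely; yet the policy “everyone with a green ball takes the bet”, evaluated objectively over all games, still loses about $20 per game.

```python
import random

rng = random.Random(0)  # fixed seed for reproducibility
N = 100_000

green_urn_given_my_green = []  # first-person: urn identity when MY ball is green
policy_payoffs = []            # objective: collective payoff if all greens say yes

for _ in range(N):
    green_urn = rng.random() < 0.5
    n_green = 18 if green_urn else 2
    balls = ["G"] * n_green + ["R"] * (20 - n_green)
    rng.shuffle(balls)

    if balls[0] == "G":  # condition on participant 0 ("my") ball being green
        green_urn_given_my_green.append(green_urn)

    # Under "always yes", the bet is always on: $1 per green, -$3 per red.
    greens = balls.count("G")
    policy_payoffs.append(greens * 1 - (20 - greens) * 3)

p = sum(green_urn_given_my_green) / len(green_urn_given_my_green)
avg = sum(policy_payoffs) / N
print(f"P(green urn | my ball is green) ~ {p:.2f}")          # ~0.9
print(f"avg collective payoff of 'always yes' ~ {avg:.1f}")  # ~-20
```

The first number answers a first-person question; the second evaluates a prescribed policy from the objective viewpoint. Both are correct, and neither overrides the other.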
Other Takes on the Paradox
In his post, Radford Neal aptly demonstrated that the inconsistency disappears if we do not assume everyone in the same situation would reason and act the exact same way. I agree; this evaluation is correct. However, in my humble opinion, that is not because such an assumption is unrealistic or possibly factually inaccurate. The real reason, which Radford Neal also pinpointed, is that people tend to use that assumption to treat my personal decision and the other participants’ decisions as one. This acausal analysis considers my personal strategy to prescribe every participant’s action, confusing it with the coordination strategy. Without this acausal analysis, people won’t make the mistake of using the first-person probability to determine the coordination strategy.
I also want to call attention to the fact that, after drawing a green ball, a personal strategy different from the pre-game plan is not a logical inconsistency. For instance, say you received a green ball; by assuming the other participants would follow the pre-game plan and all say no to the bet, you derive that your best move can be either saying yes or saying no. It makes no difference to your interest. This “yes or no” strategy is different from the “always say no” pre-game plan. Prof. Neal said it is a little disappointing that the new plan does not exactly reproduce the pre-game plan. But it doesn’t have to. The coordination strategy is still the same: “always say no.” The difference arises because you are now only thinking for yourself: no longer committed to coordinate.[2]
Ape in the coat’s solution is committed to the acausal analysis. He resolves the inconsistency by proposing the validity of two different probabilities of which urn is chosen. In that respect, it is similar to the current post. However, Ape in the coat does not suppose the difference in probability comes from different perspectives and the consequent difference between coordination and personal strategies. In fact, he specifically mentioned that self-specification has nothing to do with probability, or math in general. In contrast to the current post, he proposes that both probabilities are valid because the way of interpreting the fact “I received a green ball” depends on the betting scheme.
For instance, if a betting scheme’s payoff is based on individuals (like Question 2), then the correct way to interpret “I received a green ball” would be “a random person received a green ball,” generating the probability of 0.9. However, for the betting scheme specified in the paradox, where the participants with green balls make the decision, the correct way to interpret “I received a green ball” must be “a person who always gets green balls received a green ball,” which is no new information, so the probability remains at 0.5.
I disagree with this approach for several reasons. First, this resolution makes probabilities dependent on the context of betting schemes. It implies a reversal of reasoning: instead of using the correct probability to generate correct decisions for bets, we ought to check what bets are offered and then work backward to get the correct probability.
But more importantly, I disagree with mapping the first-person perspective (indexicals such as “I” or “now”) onto objectively defined agents. As I have discussed in previous posts, perspective is primitive and there is no transcoding. People consistently try to use assumptions (e.g., SIA, SSA) to “explain away” the first-person perspective in anthropic problems because otherwise they would not be able to answer questions like self-locating probability. But this is accompanied by perspective switches that lead to paradoxes like the one presented here. For the current paradox, interpreting “I” as a decider who always sees green is akin to the logic of the Self-Sampling Assumption, which I have long argued against.
Because Perspective Based Reasoning considers decisions from different perspectives as distinct from one another, it has no problem recognizing different optimal strategies: a possible, though not very satisfying, solution to problems such as the Newcomb paradox.
We can even have a case where the correct personal strategy is completely opposite to the pre-game plan. For instance, after drawing a green ball, say you assumed the others would all say yes to the bet. Then, using the first-person probability of 0.9, you concluded the best action for yourself is to accept the bet. As it turns out, the others did say yes, maybe even based on the exact reasoning you used. So everyone’s assumption about the others’ choices is factually correct, and everyone’s decision is the best decision for themselves. But, one may ask, isn’t everyone worse off? Isn’t this a reflective inconsistency? No. The optimal coordination strategy hasn’t changed; if participants were committed to it, they would keep saying no. The change in decision arises because they are no longer coordinating as they were while making the pre-game plan. An everyone-for-themselves situation can be worse for everyone than cooperation, even when each makes the best decision for themselves. And in this case it is actualized by a rather unfortunate, nevertheless correct, assumption about others’ actions.