The exposition of meta-probability is well done, and shows an interesting way of examining and evaluating scenarios. However, I would take issue with the first section of this article in which you establish single probability (expected utility) calculations as insufficient for the problem, and present meta-probability as the solution.
In particular, you say
What’s interesting is that, when you have to decide whether or not to gamble your first coin, the probability is exactly the same in the two cases (p=0.45 of a $2 payout). However, the rational course of action is different. What’s up with that?
Here, a single probability value fails to capture everything you know about an uncertain event. And, it’s a case in which that failure matters.
I do not believe that this is a failure of applying a single probability to the situation, but merely calculating the probability wrongly, by ignoring future effects of your choice. I think this is most clearly illustrated by scaling the problem down to the case where you are handed a green box, and only two coins. In this simplified problem, we can clearly examine all possible strategies; the payout distributions below are double-checked in a short sketch after the list.
Strategy 1 would be to hold on to your two $1 coins. There is a 100% chance of a $2.00 payout.
Strategy 2 would be to insert both of your coins into the box. There is a 50.5% chance of a $0.00 payout, a 40.5% chance of a $4.00 payout, and a 9% chance of a $2.00 payout.
Strategy 3 would be to insert one coin, and then insert the second only if the first pays out. There is a 55% chance of a $1.00 payout, a 4.5% chance of a $2.00 payout, and a 40.5% chance of a $4.00 payout.
Strategy 4 would be to insert one coin, and then insert the second only if the first doesn’t pay out. There is a 50.5% chance of a $0.00 payout, a 4.5% chance of a $2.00 payout, and a 45% chance of a $3.00 payout.
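To make these numbers easy to check, here is a minimal Python sketch (my own construction, not from the original article). It enumerates the two green-box possibilities stated in the problem (a box that never pays, and a box that pays $2 per $1 coin 90% of the time, each with prior probability 1/2) and reproduces the distributions above; final wealth counts kept coins as well as payouts.

```python
from fractions import Fraction
from collections import defaultdict

# The green box either never pays (prior 1/2) or pays $2 per $1 coin
# with probability 9/10 (prior 1/2). We start with two $1 coins.
BOXES = [(Fraction(1, 2), Fraction(0)), (Fraction(1, 2), Fraction(9, 10))]

def distribution(insert_second):
    """Final-wealth distribution when the first coin is always inserted.

    insert_second(first_paid) decides whether to spend the second coin."""
    dist = defaultdict(Fraction)
    for prior, p in BOXES:
        for first_paid, p1 in ((True, p), (False, 1 - p)):
            wealth = (2 if first_paid else 0) + 1  # payout plus the kept coin
            if insert_second(first_paid):
                for second_paid, p2 in ((True, p), (False, 1 - p)):
                    dist[wealth - 1 + (2 if second_paid else 0)] += prior * p1 * p2
            else:
                dist[wealth] += prior * p1
    return {w: float(q) for w, q in sorted(dist.items())}

print("Strategy 1 (keep both coins): {2: 1.0}")  # nothing is ever risked
print("Strategy 2 (always insert):  ", distribution(lambda paid: True))
print("Strategy 3 (insert if paid): ", distribution(lambda paid: paid))
print("Strategy 4 (insert if not):  ", distribution(lambda paid: not paid))
```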
When put in these terms, it seems quite obvious that your choice about the first coin depends on more than the expected payoff from that coin alone, because quite clearly that choice pays off (or doesn’t pay off) through the later insertions as well. This seems like an error in calculating the payoff matrix rather than a flaw in the technique of single probability values itself. It ignores the fact that inserting the first coin not only pays you off immediately, but also pays you off in the future by giving you information about the box.
This problem easily succumbs to standard expected value calculations if all actions are considered. The steps remain the same as always (a worked comparison for the two-coin case follows the list):
Assign a utility to each dollar amount outcome
Calculate the expected utility of all possible strategies
Choose the strategy with the highest expected utility
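For instance, suppose (purely for illustration) that utility is linear in dollars. Then, using the distributions above: Strategy 1 gives $2.00 with certainty; Strategy 2 gives 0.405 × $4 + 0.09 × $2 = $1.80; Strategy 3 gives 0.55 × $1 + 0.045 × $2 + 0.405 × $4 = $2.26; and Strategy 4 gives 0.045 × $2 + 0.45 × $3 = $1.44. Strategy 3 comes out on top precisely because it uses the first coin’s outcome as information about the box.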
In the case of two coins, we were able to trivially calculate the outcomes of all possible strategies, but in larger instances of the problem it might be advisable to use shortcuts in the calculations. However, it remains true that the best choice is the one you would have gotten from the full expected value calculation.
I think the confusion arises because a lot of the time problems are presented in a way that screens them off from the rest of the world. For example, you are given a box, and it either has $10.00 or $100.00 in it. Once you open the box, the only effect it has on you is the amount of money you got. After you get the money, the box does not matter to the rest of the world. Problems are presented this way so that it is easy to factor out the decisions and calculations you have to make from every other decision you have to make. However, decisions are not necessarily this way (in fact, in real life very few decisions are). In the choice of inserting the first coin or not, this is simply not the case, despite the superficial similarities to standard “box” problems.
Although you clearly understand that the payoffs from successive coins are entangled, you only apply this knowledge in your informal approach to the problem. The failure to consider the full effects of inserting the first coin may be psychologically encouraged by the technique of “single probability calculations”, but it is certainly not a failure of the technique itself to capture such situations.
The substantive point here isn’t about EU calculations per se. Running a full analysis of everything that might happen and doing an EU calculation on that basis is fine, and I don’t think the OP disputes this.
The subtlety is about what numerical data can formally represent your full state of knowledge. The claim is that a mere probability of getting the $2 payout does not. It’s the case that on the first use of a box, the probability of the payout given its colour is 0.45 regardless of the colour.
However, if you merely hold onto that probability, then if you put in a coin and so learn something about the boxes you can’t update that probability to figure out what the probability of payout for the second attempt is. You need to go back and also remember whether the box is green or brown. The point of Jaynes and the A_p distribution is that it actually does screen off all other information. If you keep track of it you never need to worry about remembering the colour of the box, or the setup of the experiment. Just this “meta-distribution”.
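To make this concrete, here is a small Python sketch (my own construction, using the 0% and 90% machine probabilities from the example) of the update being described: the meta-distribution for a green box starts with half its mass at p = 0 and half at p = 0.9, and one coin’s outcome updates it by Bayes’ rule. The bare number 0.45 could not support either update.

```python
# A minimal sketch: the A_p ("meta") distribution for a green box,
# updated after observing what the first coin does.
prior = {0.0: 0.5, 0.9: 0.5}  # probability mass on each candidate payout rate

def expected_p(dist):
    """The single payout probability implied by the meta-distribution."""
    return sum(p * mass for p, mass in dist.items())

def update(dist, paid):
    """Bayes-update the meta-distribution on one coin's outcome."""
    post = {p: mass * (p if paid else 1 - p) for p, mass in dist.items()}
    total = sum(post.values())
    return {p: mass / total for p, mass in post.items()}

print(expected_p(prior))                      # 0.45, same as a brown box
print(update(prior, paid=True))               # all mass moves to p = 0.9
print(expected_p(update(prior, paid=False)))  # ~0.082 for the next coin
```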
The subtlety is about what numerical data can formally represent your full state of knowledge. The claim is that a mere probability of getting the $2 payout does not.
However, a single probability for each outcome given each strategy is all the information needed. The problem is not with using single probabilities to represent knowledge about the world; it’s the straw math that was used to represent the technique. To me, this reasoning is equivalent to the following:
“You work at a store where management is highly disorganized. Although they precisely track the number of days you have worked since the last payday, they never remember when they last paid you, and thus every day of the work week has a 1⁄5 chance of being a payday. For simplicity’s sake, let’s assume you earn $100 a day.
You wake up on Monday and do the following calculation: If you go in to work, you have a 1⁄5 chance of being paid. Thus the expected payoff of working today is $20, which is too low for it to be worth it. So you skip work. On Tuesday, you make the same calculation, and decide that it’s not worth it to work again, and so you continue forever.
I visit you and immediately point out that you’re being irrational. After all, a salary of $100 a day clearly is worth it to you, yet you are not working. I look at your calculations, and immediately find the problem: You’re using a single probability to represent your expected payoff from working! I tell you that using a meta-probability distribution fixes this problem, and so you excitedly scrap your previous calculations and set about using a meta-probability distribution instead. We decide that a Gaussian sharply peaked at 0.2 best represents our meta-probability distribution, and I send you on your way.”
Of course, in this case, the meta-probability distribution doesn’t change anything. You still continue skipping work, because I have devised the hypothetical situation to illustrate my point (evil laugh). The point is that in this problem the meta-probability distribution solves nothing, because the problem is not a lack of meta-probability, but rather a failure to consider future consequences.
In both the OP’s example and mine, the problem is that the math was done incorrectly, not that you need meta-probabilities. As you said, meta-probabilities are a method of screening off additional labels on your probability distributions for a particular class of problems where you are taking repeated samples that are entangled in a very particular sort of way. As I said above, I appreciate the exposition of meta-probabilities as a tool, and your comment has also helped me better understand their instrumental nature, but I take issue with what sort of tool they are presented as.
If you do the calculations directly with the probabilities, your calculation will succeed if you do the math right, and fail if you do the math wrong. Meta-probabilities are a particular way of representing certain calculations, and they succeed or fail in their own right. If you use them to represent the correct direct probabilities, you will get the right answer, but they are only an aid in the calculation; they never fix any problem with direct probability calculations. Fixing the calculation and the use of probabilities are orthogonal issues.
To make a blunt analogy, this is like someone trying to plug an Ethernet cable into a phone jack, and then saying “when Ethernet fails, wifi works”, conveniently plugging in the wifi adapter correctly.
The crux of the dispute, in my eyes, is not whether wifi can work in certain situations, but whether there’s anything actually wrong with Ethernet in the first place.
So, my observation is that without meta-distributions (or A_p), or conditioning on a pile of past information (and thus tracking /more/ than just a probability distribution over current outcomes), you don’t have room in your state of knowledge to even talk coherently about sensitivity to new information. Once you can talk about a complete state of knowledge, you can begin to talk about the utility of long-term strategies.
For example, in your example, one would have the same probability of being paid today if 20% of employers actually pay you every day, while 80% of employers never pay you. But in such an environment, it would not make sense to work a second day in 80% of cases. The optimal strategy depends on what you know, and to represent that in general requires more than a straight probability.
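To spell that out with numbers: in both worlds the chance of being paid on a given first day is 0.2, but one worked day separates them. Under the sharply peaked Gaussian, a payday (or its absence) barely moves the estimate, and each later day is still worth roughly $20 in expectation. Under the 20/80 mixture, a single payday reveals an employer who always pays (so each later day is worth $100), while a single unpaid day reveals one who never will. The number 0.2 on its own cannot distinguish the two situations.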
There are different problems arising from the distinction between choosing a long-term policy to follow and choosing a one-shot action. But we can’t even approach this question in general unless we can talk sensibly about a sufficient set of information to keep track of. There are two distinct problems, one prior to the other.
Jaynes does discuss a problem which is closer to your concerns (that of estimating neutron multiplication in a 1-d experiment, section 18.15, p. 579). He’s comparing two approaches, which for my purposes differ in their prior A_p distribution.
It may be helpful to read some related posts (linked by lukeprog in a comment on this post): Estimate stability, and Model Stability in Intervention Assessment, which comments on Why We Can’t Take Expected Value Estimates Literally (Even When They’re Unbiased). The first of those motivates the A_p (meta-probability) approach, the second uses it, and the third explains intuitively why it’s important in practice.
Jeremy, I think the apparent disagreement here is due to unclarity about what the point of my argument was. The point was not that this situation can’t be analyzed with decision theory; it certainly can, and I did so. The point is that different decisions have to be made in two situations where the probabilities are the same.
Your discussion seems to equate “probability” with “utility”, and the whole point of the example is that, in this case, they are not the same.
I guess my position is thus:
While there are sets of probabilities which by themselves are not adequate to capture the information about a decision, there is always a set of probabilities which is adequate to capture the information about a decision.
In that sense I do not see your article as an argument against using probabilities to represent decision information, but rather a reminder to use the correct set of probabilities.
My understanding of Chapman’s broader point (which may differ wildly from his understanding) is that determining which set of probabilities is correct for a situation can be rather hard, and so it deserves careful and serious study from people who want to think about the world in terms of probabilities.
Thanks, Jonathan, yes, that’s how I understand it.
Jaynes’ discussion motivates A_p as an efficiency hack that allows you to save memory by forgetting some details. That’s cool, although not the point I’m trying to make here.
I do not believe that this is a failure of applying a single probability to the situation, but merely calculating the probability wrongly
A single probability cannot sum up our knowledge.
Before we talk about plans, as you went on to do, we must talk about the world as it stands. We know there is a 50% chance of a 0% machine and a 50% chance of a 90% machine. Saying 45% does not encode this information. No other single number does either.
Scalar probabilities of binary outcomes are such a useful hammer that we need to stop and remember sometimes that not all uncertainties are nails.
Jeremy, thank you for this. To be clear, I wasn’t suggesting that meta-probability is the solution. It’s a solution. I chose it because I plan to use this framework in later articles, where it will (I hope) be particularly illuminating.
I would take issue with the first section of this article in which you establish single probability (expected utility) calculations as insufficient for the problem.
I don’t think it’s correct to equate probability with expected utility, as you seem to do here. The probability of a payout is the same in the two situations. The point of this example is that the probability of a particular event does not determine the optimal strategy. Because utility is dependent on your strategy, that also differs.
This problem easily succumbs to standard expected value calculations if all actions are considered.
Yes, absolutely! I chose a particularly simple problem, in which the correct decision-theoretic analysis is obvious, in order to show that probability does not always determine optimal strategy. In this case, the optimal strategies are clear (except for the exact stopping condition), and clearly different, even though the probabilities are the same.
I’m using this as an introductory wedge example. I’ve opened a Pandora’s Box: probability by itself is not a fully adequate account of rationality. Many odd things will leap and creep out of that box so long as we leave it open.
I don’t think it’s correct to equate probability with expected utility, as you seem to do here. The probability of a payout is the same in the two situations. The point of this example is that the probability of a particular event does not determine the optimal strategy. Because utility is dependent on your strategy, that also differs.
Hmmm. I was equating them as part of the standard technique of calculating the probability of outcomes from your actions, then multiplying by the utilities of those outcomes and summing to find the expected utility of a given action.
I think it’s just a question of what you think the error is in the original calculation. I find the error to be the conflation of “payout” (as in the immediate reward from inserting the coin) with “payout” (as in the expected reward from your action, including short-term and long-term rewards). It seems to me that you are saying that you can’t look at the immediate probability of payout
The point of this example is that the probability of a particular event does not determine the optimal strategy. Because utility is dependent on your strategy, that also differs.
which I agree with. But you seem to ignore the obvious solution of considering the probability of total payout, including considerations about your strategy. In that case, you really do have a single probability representing the likelihood of a single outcome, and you do get the correct answer. So I don’t see where the issue with using a single probability comes from. It seems to me an issue with using the wrong single probability.
And especially troubling is that you seem to agree that using direct probabilities to calculate the single probability of each outcome, and then weighting outcomes by desirability, will give you the correct answer, but then you say
probability by itself is not a fully adequate account of rationality.
which may be true, but I don’t think is demonstrated at all by this example.
Thank you for further explaining your thinking.
I don’t think is demonstrated at all by this example.
Yes, I see your point (although I don’t altogether agree). But, again, what I’m doing here is setting up analytical apparatus that will be helpful for more difficult cases later.
In the meantime, the LW posts I pointed to here may motivate more strongly the claim that probability alone is an insufficient guide to action.