To give some sense of the problem, if such calibration were possible, then why not calibrate yourself to generate accurate probabilities about the stock market movements and bet on them? It would be an easy and foolproof way to get rich.
I think this statement reflects either an ignorance of finance or the Dark Arts.
First, the stock market is the single worst place to try to test out ideas about probabilities, because so many other people are already trying to predict the market, and so much wealth is at stake. Other people’s predictions will remove most of the potential for arbitrage (reducing ‘signal’), and the insider trading and other forms of cheating generated by the potential for quick wealth will further distort any scientifically detectable trends in the market (increasing ‘noise’). Because investments in the stock market must be made in relatively large quantities to avoid losing your money through trading commissions, a casual theory tester is likely to run out of money long before hitting a good payoff even if he or she is already well-calibrated.
Of course, in real life, people might be moderately calibrated. The fact that one is capable of making some predictions with some accuracy and precision is not a guarantee that one will be able to reliably and detectably beat even a thin market like a political prediction clearinghouse. Nevertheless, some information is often better than none: I am (rationally) much more concerned about automobile accidents than fires, despite the fact that I know two people who have died in fires and none who have died in automobile accidents. I know this based on my inferences from published statistics, the reliability of which I make further inferences about. I am quite confident (p ~ .95) that it is sensible to drive defensively (at great cost in effort and time) while essentially ignoring fire safety (even though checking a fire extinguisher or smoke detector might take minimal effort).
I don’t play the stock market, though. I’m not that well calibrated, and probably nobody is without access to inside info of one kind or another.
I think this statement reflects either an ignorance of finance or the Dark Arts.
I’m not an expert on finance, but I am aware of everything you wrote about it in your comment. So I guess this leaves us with the second option. The Dark Arts hypothesis is probably that I’m using the extreme example of the stock market to suggest a general sweeping conclusion that in fact doesn’t hold in less extreme cases.
To which I reply: yes, the stock market is an extreme example, but I honestly can’t think of any other examples that would show otherwise. There are many examples of scientific models that provide more or less accurate probability estimates for all kinds of things, to be sure, but I have yet to hear about people achieving practical success in anything relevant by translating their common-sense feelings of confidence in various beliefs into numerical probabilities.
In my view, calibration of probability estimates can succeed only if (1) you come up with a valid scientific model which you can then use in a shut-up-and-calculate way instead of applying common sense (though you still need it to determine whether the model is applicable in the first place), or (2) you make an essentially identical judgment many times, and from your past performance you extrapolate how frequently the black box inside your head tends to be right.
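Option (2) above can be sketched in a few lines. This is only an illustration, assuming you have logged past judgments as pairs of stated confidence and outcome; the function name `calibration_table` and the sample track record are hypothetical, not anything from this thread.

```python
from collections import defaultdict


def calibration_table(records):
    """Group (stated_probability, was_correct) pairs by stated probability.

    Returns {stated_probability: (observed_hit_rate, number_of_judgments)}.
    """
    buckets = defaultdict(list)
    for stated, correct in records:
        # Round to one decimal so that similar confidence levels share a bucket.
        buckets[round(stated, 1)].append(1 if correct else 0)
    return {
        stated: (sum(hits) / len(hits), len(hits))
        for stated, hits in sorted(buckets.items())
    }


# Hypothetical track record: twenty repeated judgments of the same kind.
records = (
    [(0.9, True)] * 7 + [(0.9, False)] * 3
    + [(0.6, True)] * 6 + [(0.6, False)] * 4
)

table = calibration_table(records)
for stated, (observed, n) in table.items():
    print(f"stated {stated:.1f} -> observed {observed:.2f} over {n} judgments")
```

In this made-up record the "90%" judgments came true only 70% of the time, which is the kind of correction a track record can supply and a single one-off estimate cannot.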
Now, you try to provide some counterexamples:
I am (rationally) much more concerned about automobile accidents than fires, despite the fact that I know two people who have died in fires and none who have died in automobile accidents. I know this based on my inferences from published statistics, the reliability of which I make further inferences about. I am quite confident (p ~ .95) that it is sensible to drive defensively (at great cost in effort and time) while essentially ignoring fire safety (even though checking a fire extinguisher or smoke detector might take minimal effort.)
Frankly, the only subjective probability estimate I see here is the p~0.95 for your belief about driving. In this case, I’m not getting any more information from this number than if you just described your level of certainty in words, nor do I see any practical application to which you can put this number. I have no objection to your other conclusions, but I see nothing among them that would be controversial to even the most extreme frequentist.
Not sure who voted down your reply; it looks polite and well-reasoned to me.
I believe you when you say that the stock market was honestly intended as representative, although, of course, I continue to disagree about whether it actually is representative.
Here are some more counterexamples:
*When deciding whether to invest in an online bank that pays 1% interest or a local community bank that pays 0.1% interest, I must calculate the odds that each bank will fail before I take my money out; I cannot possibly have a scientific model that generates replicable results for these two banks while also holding down a day job, but numbers will nevertheless help me make a decision that is not driven by an emotional urge to stay with (or leave) an old bank based on customer service considerations that I rationally value as far less than the value of my principal.
*When deciding whether to donate time, money, or neither to a local election campaign, it will help to know which of my donations will have a 10^-6 chance, a 10^-4 chance, and a 10^-2 chance of swinging the election. Numbers are important here because irrational friends and colleagues will urge me to do what ‘feels right’ or to ‘do my part’ without pausing to consider whether this serves any of our goals. If I can generate a replicable scientific model that says whether an extra $500 will win an election, I should stop electioneering and sign up for a job as a tenured political science faculty member, but I nevertheless want to know what the odds are, approximately, in each case, if only so that I can pick which campaign to work on.
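To make the bank example concrete, here is a rough sketch of the comparison. The interest rates are the ones given above; everything else is an assumption made for illustration: a one-year horizon, a failure that wipes out the whole deposit (no deposit insurance), and a made-up failure probability for the local bank. The function names are hypothetical.

```python
def expected_value(principal, rate, p_fail):
    """Expected balance after one period, assuming a failure loses everything."""
    return principal * (1.0 - p_fail) * (1.0 + rate)


def break_even_p_fail(rate_high, rate_low, p_fail_low):
    """Failure probability of the higher-rate bank at which both banks
    have the same expected value."""
    return 1.0 - (1.0 - p_fail_low) * (1.0 + rate_low) / (1.0 + rate_high)


online_rate = 0.01   # 1% interest, from the example above
local_rate = 0.001   # 0.1% interest, from the example above
p_local = 0.001      # hypothetical subjective estimate for the local bank

threshold = break_even_p_fail(online_rate, local_rate, p_local)
print(f"online bank is the better bet if its failure probability is below {threshold:.4f}")
```

Under these assumptions the break-even point is a failure probability of roughly 1% for the online bank, so the decision only requires placing my estimate above or below that threshold, not pinning it down exactly.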
As for your objection that:
the only subjective probability estimate I see here is the p~0.95 for your belief about driving. In this case, I’m not getting any more information from this number than if you just described your level of certainty in words,
I suppose I have left a few steps out of my analysis, which I am spelling out in full now:
*Published statistics say that the risk of dying in a fire is 10^-7 per person-year and the risk of dying in a car crash is 10^-4 per person-year (a report of what is no doubt someone else’s subjective but relatively evidence-based estimate).
*The odds that these statistics are off by more than a factor of 10 relative to each other are less than 10^-1 (a subjective estimate).
*My cost in effort to protect against car crashes is more than 10 times higher than my cost in effort to protect against fires.
*I value the disutility of death-by-fire and death-by-car-crash roughly equally.
*Therefore, there exists a coherent utility function with respect to the relevant states of the world and my relevant strategies such that it is rational for me to protect against car crashes but not fires.
*Therefore, one technique that could be used to show that my behavior is internally incoherent has failed to reject the null hypothesis.
*Therefore, I have some Bayesian evidence that my behavior is rational.
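The arithmetic behind these steps can be written out as a short sketch. The risk figures, the factor-of-10 error bound, and the 10x effort ratio are taken from the steps above (10x being the stated lower bound); the further assumption that each precaution removes the same fraction of its risk is mine, added only to make the comparison computable, and the function name is hypothetical.

```python
def benefit_per_effort_ratio(car_risk, fire_risk, effort_ratio, relative_error=1.0):
    """How many times more risk reduction per unit of effort car precautions buy
    compared with fire precautions.

    Assumes (hypothetically) that each precaution removes the same fraction
    of its risk, and that deaths by either cause are valued equally.
    """
    return (car_risk / fire_risk) / relative_error / effort_ratio


fire_risk = 1e-7     # deaths per person-year, from the published statistics
car_risk = 1e-4      # deaths per person-year, from the published statistics
effort_ratio = 10.0  # car precautions cost at least 10x the effort of fire precautions

nominal = benefit_per_effort_ratio(car_risk, fire_risk, effort_ratio)
pessimistic = benefit_per_effort_ratio(car_risk, fire_risk, effort_ratio, relative_error=10.0)

print(f"nominal advantage of car precautions per unit effort: {nominal:.0f}x")
print(f"advantage if the statistics are off by a factor of 10: {pessimistic:.0f}x")
```

On these numbers the ordering survives a factor-of-10 error in the statistics, and it would only flip if the effort ratio were closer to 100x than to 10x.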
Please let me know if you still think I’m just putting fancy arithmetic labels on what is essentially ‘frequentist’ reasoning, and, if so, exactly what you mean by ‘frequentist.’ I can Wikipedia the standard definition, but it doesn’t quite seem to fit here, imho.
Regarding your examples with banks and donations, when I imagine myself in such situations, I still don’t see how numbers derived from my own common-sense reasoning can be useful. I can see myself making a decision based on a simple common-sense impression that one bank looks less shady, or that it’s bigger and thus more likely to be bailed out, etc. Similarly, I could act on a vague impression that one political candidacy I’d favor is far more hopeless than another, and so on. On the other hand, I could also judge from the results of calculations based on numbers from real expert input, like actuarial tables for failures of banks of various types, or the poll numbers for elections, etc.
What I cannot imagine, however, is doing anything sensible and useful with probabilities dreamed up from vague common-sense impressions. For example, looking at a bank, getting the impression that it’s reputable and solid, and then saying, “What’s the probability it will fail before time T? Um... seems really unlikely… let’s say 0.1%,” and then using this number to calculate my expected returns.
Now, regarding your example with driving vs. fires, suppose I simply say: “Looking at the statistical tables, one is far more likely to be killed in a car accident than in a fire. I don’t see any way in which I’m exceptional in my exposure to either, so if I want to make myself safer, it would be stupid to invest more effort in reducing the chance of fire than in more careful driving.” What precisely have you gained with your calculation relative to this plain and clear English statement?
In particular, what is the significance of these subjectively estimated probabilities like p=10^-1 in step 2? What more does this number tell us than a simple statement like “I don’t think it’s likely”? Also, notice that my earlier comment specifically questioned the meaningfulness and practical usefulness of the numerical claim that p~0.95 for this conclusion, and I don’t see how it comes out of your calculation. These seem to be exactly the sorts of dreamed-up probability numbers whose meaningfulness I’m denying.