What exactly is the mistake I’m making when I say that I believe such-and-such is true with probability 0.001? Is it that I’m not likely to actually be right 999 times out of 1000 occasions when I say this? If so, then you’re (merely) worried about my calibration, not about the fundamental correspondence between beliefs and probabilities.
It’s not that I’m worried about your poor calibration in some particular instance, but that I believe that accurate calibration in this sense is impossible in practice, except in some very special cases.
(To give some sense of the problem, if such calibration were possible, then why not calibrate yourself to generate accurate probabilities about stock market movements and bet on them? It would be an easy and foolproof way to get rich. But of course there is no way you can make your numbers match reality, not in this problem, nor in most others.)
Or is it, as you seem now to be suggesting, a question of attire: no one has any business speaking “numerically” unless they’re (metaphorically speaking) “wearing a lab coat”? That is, using numbers is a privilege reserved for scientists who’ve done specific kinds of calculations?
The way you put it, “scientists” sounds too exclusive. Carpenters, accountants, cashiers, etc. also use numbers and numerical calculations in valid ways. However, their use of numbers can ultimately be scrutinized and justified in ways similar to the scientific use of numbers (even if they themselves wouldn’t be up to that task), so with that qualification, my answer would be yes.
(And unfortunately, in practice it’s not at all rare to see people using numbers in ways that are fundamentally unsound, which sometimes gives rise to whole edifices of pseudoscience. I discussed one such example from economics in this thread.)
Now, you say:
It seems to me that the contrast you are positing between “numerical” statements and other indications of degree is illusory. The only difference is that numbers permit an arbitrarily high level of precision; their use doesn’t automatically imply a particular level. Even in the context of scientific calculations, the numbers involved are subject to some particular level of uncertainty. When a scientist makes a calculation to 15 decimal places, they shouldn’t be interpreted as distinguishing between different 20-decimal-digit numbers.
However, when a scientist makes a calculation with 15 digits of precision, or even just one, he must be able to rigorously justify this degree of precision by pointing to observations that are incompatible with the hypothesis that any of these digits, except the last one, is different. (Or in the case of mathematical constants such as pi and e, to proofs of the formulas used to calculate them.) This disclaimer is implicit in any scientific use of numbers. (Assuming valid science is being done, of course.)
And this is where, in my opinion, you construct an invalid analogy:
Likewise, when I make the claim that the probability of Amanda Knox’s guilt is 10^(-3), that should not be interpreted as distinguishing (say) between 0.001 and 0.002. It’s meant to be distinguished from 10^(-2) and (perhaps) 10^(-4). I was explicit about this when I said it was an order-of-magnitude estimate. You may worry that such disclaimers are easily forgotten—but this is to disregard the fact that similar disclaimers always apply whenever numbers are used in any context!
But these disclaimers are not at all the same! The scientist’s—or the carpenter’s, for that matter—implicit disclaimer is: “This number is subject to this uncertainty interval, but there is a rigorous argument why it cannot be outside that range.” On the other hand, your disclaimer is: “This number was devised using an intuitive and arbitrary procedure that doesn’t provide any rigorous argument about the range it must be in.”
And if I may be permitted such a comment, I do see lots of instances here where people seem to forget about this disclaimer, and write as if they believed that they could actually become Bayesian inferers, rather than creatures who depend on capricious black-box circuits inside their heads to make any interesting judgment about anything, and who are (with the present level of technology) largely unable to examine the internal functioning of these boxes and improve them.
Here’s the way I do it: I think approximately in terms of the following “scale” of improbabilities:
I don’t think such usage is unreasonable, but I think it falls under what I call using numbers as vague figures of speech.
To give some sense of the problem, if such calibration were possible, then why not calibrate yourself to generate accurate probabilities about stock market movements and bet on them? It would be an easy and foolproof way to get rich.
I think this statement reflects either an ignorance of finance or the Dark Arts.
First, the stock market is the single worst place to try to test out ideas about probabilities, because so many other people are already trying to predict the market, and so much wealth is at stake. Other people’s predictions will remove most of the potential for arbitrage (reducing ‘signal’), and the insider trading and other forms of cheating generated by the potential for quick wealth will further distort any scientifically detectable trends in the market (increasing ‘noise’). Because investments in the stock market must be made in relatively large quantities to avoid losing your money through trading commissions, a casual theory-tester is likely to run out of money long before hitting a good payoff even if he or she is already well-calibrated.
Of course, in real life, people might be moderately calibrated. The fact that one is capable of making some predictions with some accuracy and precision is not a guarantee that one will be able to reliably and detectably beat even a thin market like a political prediction clearinghouse. Nevertheless, some information is often better than none: I am (rationally) much more concerned about automobile accidents than fires, despite the fact that I know two people who have died in fires and none who have died in automobile accidents. I know this based on my inferences from published statistics, the reliability of which I make further inferences about. I am quite confident (p ~ .95) that it is sensible to drive defensively (at great cost in effort and time) while essentially ignoring fire safety (even though checking a fire extinguisher or smoke detector might take minimal effort).
I don’t play the stock market, though. I’m not that well calibrated, and probably nobody is without access to inside info of one kind or another.
I think this statement reflects either an ignorance of finance or the Dark Arts.
I’m not an expert on finance, but I am aware of everything you wrote about it in your comment. So I guess this leaves us with the second option. The Dark Arts hypothesis is probably that I’m using the extreme example of the stock market to suggest a general sweeping conclusion that in fact doesn’t hold in less extreme cases.
To which I reply: yes, the stock market is an extreme example, but I honestly can’t think of any other examples that would show otherwise. There are many examples of scientific models that provide more or less accurate probability estimates for all kinds of things, to be sure, but I have yet to hear about people achieving practical success in anything relevant by translating their common-sense feelings of confidence in various beliefs into numerical probabilities.
In my view, calibration of probability estimates can succeed only if (1) you come up with a valid scientific model which you can then use in a shut-up-and-calculate way instead of applying common sense (though you still need it to determine whether the model is applicable in the first place), or (2) you make an essentially identical judgment many times, and from your past performance you extrapolate how frequently the black box inside your head tends to be right.
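Option (2) can be sketched concretely. The idea is simply to compare a stated confidence level against the empirical hit rate over many similar past judgments; the track record below is invented for illustration:

```python
# Hypothetical track record: for each past judgment made at "about 90%
# confident", record whether the belief turned out true (1) or false (0).
past_outcomes = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1,
                 1, 1, 1, 1, 0, 1, 1, 1, 1, 1]

stated_confidence = 0.9
observed_frequency = sum(past_outcomes) / len(past_outcomes)

# Calibration in this narrow, frequency-based sense: does the stated number
# roughly match how often the black box in one's head was actually right?
print(f"stated: {stated_confidence:.2f}, observed: {observed_frequency:.2f}")
```

Note that this only licenses a number for future judgments drawn from the same reference class of "essentially identical" judgments, which is exactly the restriction at issue.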
Now, you try to provide some counterexamples:
I am (rationally) much more concerned about automobile accidents than fires, despite the fact that I know two people who have died in fires and none who have died in automobile accidents. I know this based on my inferences from published statistics, the reliability of which I make further inferences about. I am quite confident (p ~ .95) that it is sensible to drive defensively (at great cost in effort and time) while essentially ignoring fire safety (even though checking a fire extinguisher or smoke detector might take minimal effort).
Frankly, the only subjective probability estimate I see here is the p~0.95 for your belief about driving. In this case, I’m not getting any more information from this number than if you just described your level of certainty in words, nor do I see any practical application to which you can put this number. I have no objection to your other conclusions, but I see nothing among them that would be controversial to even the most extreme frequentist.
Not sure who voted down your reply; it looks polite and well-reasoned to me.
I believe you when you say that the stock market was honestly intended as representative, although, of course, I continue to disagree about whether it actually is representative.
Here are some more counterexamples:
*When deciding whether to invest in an online bank that pays 1% interest or a local community bank that pays 0.1% interest, I must calculate the odds that each bank will fail before I take my money out; I cannot possibly have a scientific model that generates replicable results for these two banks while also holding down a day job, but numbers will nevertheless help me make a decision that is not driven by an emotional urge to stay with (or leave) an old bank based on customer service considerations that I rationally value as far less than the value of my principal.
*When deciding whether to donate time, money, or neither to a local election campaign, it will help to know which of my donations will have a 10^-6 chance, a 10^-4 chance, or a 10^-2 chance of swinging the election. Numbers are important here because irrational friends and colleagues will urge me to do what ‘feels right’ or to ‘do my part’ without pausing to consider whether this serves any of our goals. If I can generate a replicable scientific model that says whether an extra $500 will win an election, I should stop electioneering and sign up for a job as a tenured political science faculty member, but I nevertheless want to know what the odds are, approximately, in each case, if only so that I can pick which campaign to work on.
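The arithmetic behind both examples can be sketched as follows; all dollar figures and the subjective value of a win are invented for illustration:

```python
# Bank choice: the higher rate is worth taking only if the extra failure
# risk costs less in expectation than the extra interest earned.
principal = 10_000.0
extra_interest = principal * (0.01 - 0.001)           # 1% vs 0.1% -> ~$90/year

# Break-even: the extra annual failure probability at which the expected
# loss of (uninsured) principal would exactly cancel the extra interest.
break_even_failure_prob = extra_interest / principal  # ~0.009, i.e. ~1%/year

# Election choice: expected value of a donation = P(swing) * value of a win.
value_of_win = 1_000_000.0                            # subjective, invented
for p_swing in (1e-6, 1e-4, 1e-2):
    print(f"P(swing)={p_swing:g}: expected value = ${p_swing * value_of_win:,.2f}")
```

Even rough order-of-magnitude inputs settle the bank question (any plausible difference in failure rates is far below 1%/year) and rank the campaigns by expected impact.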
As for your objection that:
the only subjective probability estimate I see here is the p~0.95 for your belief about driving. In this case, I’m not getting any more information from this number than if you just described your level of certainty in words,
I suppose I have left a few steps out of my analysis, which I am spelling out in full now:
*Published statistics say that the risk of dying in a fire is 10^-7/person-year and the risk of dying in a car crash is 10^-4/person-year (a report of what is no doubt someone else’s subjective but relatively evidence-based estimate).
*The odds that these statistics are off by more than a factor of 10 relative to each other are less than 10^-1 (a subjective estimate).
*My cost in effort to protect against car crashes is more than 10 times higher than my cost in effort to protect against fires.
*I value the disutility of death-by-fire and death-by-car-crash roughly equally.
*Therefore, there exists a coherent utility function with respect to the relevant states of the world and my relevant strategies such that it is rational for me to protect against car crashes but not fires.
*Therefore, one technique that could be used to show that my behavior is internally incoherent has failed to reject the null hypothesis.
*Therefore, I have some Bayesian evidence that my behavior is rational.
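Spelled out as arithmetic, the comparison in the steps above looks roughly like this, using the orders of magnitude given and the factor-of-10 error allowance:

```python
# Annual death risks from published statistics (orders of magnitude as above).
p_fire  = 1e-7   # per person-year
p_crash = 1e-4   # per person-year

# Even if the relative statistics are off by a factor of 10 in the worst
# direction, crashes remain ~100x more dangerous than fires.
worst_case_ratio = (p_crash / p_fire) / 10

# With equal disutility of death, prioritizing crash protection is coherent
# as long as its extra effort cost (>10x that of fire protection) stays
# below the worst-case risk ratio.
effort_ratio = 10
print(f"worst-case crash/fire risk ratio: {worst_case_ratio:.0f}")
print("defensive driving justified:", worst_case_ratio > effort_ratio)
```

The point of the explicit steps is only to show that some coherent utility function supports the behavior; the code merely checks that the inequalities hold under the stated worst case.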
Please let me know if you still think I’m just putting fancy arithmetic labels on what is essentially ‘frequentist’ reasoning, and, if so, exactly what you mean by ‘frequentist.’ I can Wikipedia the standard definition, but it doesn’t quite seem to fit here, imho.
Regarding your examples with banks and donations, when I imagine myself in such situations, I still don’t see how numbers derived from my own common-sense reasoning can be useful. I can see myself making a decision based on a simple common-sense impression that one bank looks less shady, or that it’s bigger and thus more likely to be bailed out, etc. Similarly, I could act on a vague impression that one political candidacy I’d favor is far more hopeless than another, and so on. On the other hand, I could also judge from the results of calculations based on numbers from real expert input, like actuarial tables for failures of banks of various types, or the poll numbers for elections, etc.
What I cannot imagine, however, is doing anything sensible and useful with probabilities dreamed up from vague common-sense impressions. For example, looking at a bank, getting the impression that it’s reputable and solid, and then saying, “What’s the probability it will fail before time T? Um… seems really unlikely… let’s say 0.1%,” and then using this number to calculate my expected returns.
Now, regarding your example with driving vs. fires, suppose I simply say: “Looking at the statistical tables, one is far more likely to be killed by a car accident than a fire. I don’t see any way in which I’m exceptional in my exposure to either, so if I want to make myself safer, it would be stupid to invest more effort in reducing the chance of fire than in more careful driving.” What precisely have you gained with your calculation relative to this plain and clear English statement?
In particular, what is the significance of these subjectively estimated probabilities like p=10^-1 in step 2? What more does this number tell us than a simple statement like “I don’t think it’s likely”? Also, notice that my earlier comment specifically questioned the meaningfulness and practical usefulness of the numerical claim that p~0.95 for this conclusion, and I don’t see how it comes out of your calculation. These seem to be exactly the sorts of dreamed-up probability numbers whose meaningfulness I’m denying.