Although lots of people here consider it a hallmark of “rationality,” assigning numerical probabilities to common-sense conclusions and beliefs is meaningless, except perhaps as a vague figure of speech. (Absolutely certain.)
assigning numerical probabilities to common-sense conclusions and beliefs is meaningless
It is risky to deprecate something as “meaningless”—a ritual, a practice, a word, an idiom. Risky because the actual meaning may be something very different than you imagine. That seems to be the case here with attaching numbers to subjective probabilities.
The meaning of attaching a number to something lies in how that number may be used to generate a second number that can then be attached to something else. There is no point in providing a number to associate with the variable ‘m’ (i.e. that number is meaningless) unless you simultaneously provide a number to associate with the variable ‘f’ and then plug both into “f=ma” to generate a third number to associate with the variable ‘a’, a number which you can test empirically.
Similarly, a single subjective probability estimate may seem somewhat meaningless in isolation, but if you place it into a context with enough related subjective probability estimates and empirically measured frequencies, then all those probabilities and frequencies can be combined and compared using the standard formulas of Bayesian probability:
P(~A) = 1 - P(A)
P(B|A)*P(A)=P(A&B)=P(A|B)*P(B)
So, if you want to deprecate as “meaningless” my estimate that the Democrats have a 40% chance to maintain their House majority in the next election, go ahead. But you cannot then also deprecate my estimate that the Republicans have a 70% chance of reaching a House majority. Because the conjunction of those two probability estimates is not meaningless. It is quite respectably false.
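As a minimal sketch of that consistency check (the percentages are the ones from my example; treating the two outcomes as effectively mutually exclusive and exhaustive is a simplifying assumption):

```python
# Consistency check on two subjective probability estimates, using P(~A) = 1 - P(A).
# Assumption: "Democrats keep the House majority" and "Republicans gain a House
# majority" are treated as mutually exclusive and (nearly) exhaustive outcomes.

p_dem_keep_majority = 0.40   # subjective estimate
p_rep_gain_majority = 0.70   # subjective estimate

total = p_dem_keep_majority + p_rep_gain_majority
print(f"Sum of the two estimates: {total:.2f}")
if total > 1.0:
    print("Inconsistent: the conjunction of these estimates is (respectably) false.")
```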
I think you’re not drawing a clear enough distinction between two different things, namely the mathematical relationships between numbers, and the correspondence between numbers and reality.
If you ask an astronomer what is the mass of some asteroid, he will presumably give you a number with a few significant digits and an uncertainty interval. If you ask him to justify this number, he will be able to point to some observations that are incompatible with the assumption that the mass is outside this interval, which follows from a mathematical argument based on our best knowledge of physics. If you ask for more significant digits, he will say that we don’t know (and that beyond a certain accuracy, the question doesn’t even make sense, since the asteroid is constantly losing and gathering small bits of mass). That’s what it means for a number to be rigorously justified.
But now imagine that I make an uneducated guess of how heavy this asteroid might be, based on no actual astronomical observation. I do of course know that it must be heavier than a few tons or otherwise it wouldn’t be noticeable from Earth as an identifiable object, and that it must be lighter than 10^20 or so tons since that’s roughly the range where smaller planets are, but it’s clearly nonsensical for me to express that guess with even one digit of precision. Yet I could insist on a precise guess, and claim that it’s “meaningful” in a way analogous to your above justification of subjective probability estimates, by deriving various mathematical and physical implications of this fact. If you deprecate my claim that its mass is 4.5237 x 10^15 kg, then you cannot also deprecate my claim that it is a sphere of radius 1 km and average density 1000 kg/m^3, since the conjunction of these claims is, by the sheer force of mathematics, false.
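A quick back-of-the-envelope check, using the figures above, of why that conjunction cannot be true:

```python
import math

# Mass implied by the "sphere of radius 1 km, density 1000 kg/m^3" claim,
# compared against the "precise" guessed mass of 4.5237e15 kg.
claimed_mass = 4.5237e15   # kg
radius = 1_000.0           # m
density = 1_000.0          # kg/m^3

volume = (4.0 / 3.0) * math.pi * radius**3   # ~4.19e9 m^3
implied_mass = density * volume              # ~4.19e12 kg

print(f"Mass implied by the sphere claim: {implied_mass:.2e} kg")
print(f"Claimed mass:                     {claimed_mass:.2e} kg")
# The claimed mass is roughly a thousand times too large, so the two claims
# cannot both be true -- which is the point: mutual consistency of guessed
# numbers does not by itself make any of them meaningful.
```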
Therefore, I don’t see how you can argue that a number is meaningful by merely noting its relationships with other numbers that follow from pure mathematics. Or am I missing something with this analogy?
I don’t see how you can argue that a number is meaningful by merely noting its relationships with other numbers that follow from pure mathematics. Or am I missing something with this analogy?
The only thing you are missing is the first paragraph of my reply. Just because something doesn’t have the kind of meaning you think it ought to have (by virtue of being a number, for example) that doesn’t justify your claim that it is meaningless.
Subjective probabilities of isolated propositions don’t have the kind of meaning you want numbers to have. But they have exactly the kind of meaning I want them to have—specifically they can be used in computations that produce consistent results.
Do you think that the digits of pi beyond the first half dozen are also meaningless?
Subjective probabilities of isolated propositions don’t have the kind of meaning you want numbers to have. But they have exactly the kind of meaning I want them to have—specifically they can be used in computations that produce consistent results.
Fair enough, but I still don’t see how this solves the problem of the correspondence between numbers and reality. Any number can be used in computations that produce consistent results if you just start plugging it into formulas derived from some consistent mathematical theory. It is when the numbers are used as basis for claims about the real, physical world that I insist on an explanation of how exactly they are derived and how their claimed correspondence with reality is justified.
Do you think that the digits of pi beyond the first half dozen are also meaningless?
The digits of pi are an artifact of pure mathematics, so I don’t think it’s a good analogy for what we’re talking about. Once you’ve built up enough mathematics to define lengths of curves in Euclidean geometry, the ratio between the circumference and diameter of a circle follows by pure logic. Any suitable analogy for what we’re talking about must encompass empirical knowledge, and claims which can be falsified by empirical observations.
Subjective probabilities of isolated propositions don’t have the kind of meaning you want numbers to have. But they have exactly the kind of meaning I want them to have—specifically they can be used in computations that produce consistent results.
Fair enough, but I still don’t see how this solves the problem of the correspondence between numbers and reality.
It doesn’t have to. That is a problem you made up. Other people don’t have to buy in to your view on the proper relationship between numbers and physical reality.
My viewpoint on numbers is somewhere between platonism and formalism. I think that the meaning of a number is a particular structure in my mind. If I have an axiom system that is categorical (and, of course, usually I don’t) then that picture in my mind can be made inter-subjective in that someone who also accepts those axioms can build an isomorphic structure in their own mind. The real world has absolutely nothing to do with Tarski’s semantics—which is where I look to find out what the “meaning” of a number is.
Your complaint that subjective probabilities have no meaning is very much like the complaint of a new convert to atheism who laments that without God, life has no meaning. My advice: stop telling other people what the word “meaning” should mean.
However, if you really need some kind of affirmation, then I will provide some. I agree with you that the numbers used in subjective probabilities are less, … what is the right word, … less empirical than are the numbers you usually find in science classes. Does that make you feel better?
It doesn’t have to. That is a problem you made up. Other people don’t have to buy in to your view on the proper relationship between numbers and physical reality.
You probably wouldn’t buy that same argument if it came from a numerologist, though. I don’t think I hold any unusual and exotic views on this relationship, and in fact, I don’t think I have made any philosophical assumptions in this discussion beyond the basic common-sense observation that if you want to use numbers to talk about the real world, they should have a clear connection with something that can be measured or counted to make any sense. I don’t see any relevance of these (otherwise highly interesting) deep questions of the philosophy of math for any of my arguments.
There is nothing philosophically wrong with your position except your choice of the word “meaningless” as an epithet for the use of numbers which cannot be empirically justified. Your choice of that word is pretty much the only reason I am disagreeing with you.
Given your position on the meaninglessness of assigning a numerical probability value to a vague feeling of how likely something is, how would you decide whether you were being offered good odds if offered a bet? If you’re not in the habit of accepting bets, how do you think someone who does this for a living (a bookie for example) should go about deciding on what odds to assign to a given bet?
Given your position on the meaninglessness of assigning a numerical probability value to a vague feeling of how likely something is, how would you decide whether you were being offered good odds if offered a bet?
In reality, it is rational to bet only with people over whom you have superior relevant knowledge, or with someone who is suffering from an evident failure of common sense. Otherwise, betting is just gambling (which of course can be worthwhile for fun or signaling value). Look at the stock market: it’s pure gambling, unless you have insider knowledge or vastly higher expertise than the average investor.
This is the basic reason why I consider the emphasis on subjective Bayesian probabilities that is so popular here misguided. In technical problems where probability calculations can be helpful, the experts in the field already know how to use them. On the other hand, for the great majority of the relevant beliefs and conclusions you’ll form in life, they offer nothing useful beyond what your vague common sense is already telling you. If you start taking them too seriously, it’s easy to start fooling yourself that your thinking is more accurate and precise than it really is, and if you start actually betting on them, you’ll be just gambling.
If you’re not in the habit of accepting bets, how do you think someone who does this for a living (a bookie for example) should go about deciding on what odds to assign to a given bet?
I’m not familiar with the details of this business, but from what I understand, bookmakers work in such a way that they’re guaranteed to make a profit no matter what happens. Effectively, they exploit the inconsistencies between different people’s estimates of what the favorable odds are. (If there are bookmakers who stake their profit on some particular outcome, then I’m sure that they have insider knowledge if they can stay profitable.) Now of course, the trick is to come up with a book that is both profitable and offers odds that will sell well, but here we get into the fuzzy art of exploiting people’s biases for profit.
In reality, it is rational to bet only with people over whom you have superior relevant knowledge, or with someone who is suffering from an evident failure of common sense.
You still have to be able to translate your superior relevant knowledge into odds in order to set the terms of the bet, however. Do you not believe that this is an ability that people have varying degrees of aptitude for?
Look at the stock market: it’s pure gambling, unless you have insider knowledge or vastly higher expertise than the average investor.
Vastly higher expertise than the average investor would appear to include something like the ability in question—translating your beliefs about the future into a probability such that you can judge whether investments have positive expected value. If you accept that true alpha exists (and the evidence suggests that, though rare, a small percentage of the best investors do appear to have positive alpha) then what process do you believe those who possess it use to decide which investments are good and which bad?
What’s your opinion on prediction markets? They seem to produce fairly good probability estimates so presumably the participants must be using some better-than-random process for arriving at numerical probability estimates for their predictions.
I’m not familiar with the details of this business, but from what I understand, bookmakers work in such a way that they’re guaranteed to make a profit no matter what happens.
They certainly aim for a balanced book but they wouldn’t be very profitable if they were not reasonably competent at setting initial odds (and updating them in the light of new information). If the initial odds are wildly out of line with their customers’ estimates then they won’t be able to make a balanced book.
You still have to be able to translate your superior relevant knowledge into odds in order to set the terms of the bet however. Do you not believe that this is an ability that people have varying degrees of aptitude for?
They sure do, but in all the examples I can think of, people either just follow their intuition directly when faced with a concrete situation, or employ rigorous science to attack the problem. (It doesn’t have to be the official accredited science, of course; the Venn diagram of official science and valid science features only a partial overlap.) I just don’t see any practical examples of people successfully betting by doing calculations with probability numbers derived from their intuitive feelings of confidence that would go beyond what a mere verbal expression of these feelings would convey. Can you think of any?
If you accept that true alpha exists (and the evidence suggests that though rare a small percentage of the best investors do appear to have positive alpha) then what process do you believe those who possess it use to decide which investments are good and which bad?
Well, if I knew, I would be doing it myself—and I sure wouldn’t be talking about it publicly!
The problem with discussing investment strategies is that any non-trivial public information about this topic necessarily has to be bullshit, or at least drowned in bullshit to the point of being irrecoverable, since exclusive possession of correct information is a sure path to getting rich, but its effectiveness critically depends on exclusivity. Still, I would be surprised to find out that the success of some alpha-achieving investors is based on taking numerical expressions of common-sense confidence seriously.
In a sense, a similar problem faces anyone who aspires to be more “rational” than the average folk in any meaningful sense. Either your “rationality” manifests itself only in irrelevant matters, or you have to ask yourself what is so special and exclusive about you that you’re reaping practical success that eludes so many other people, and in such a way that they can’t just copy your approach.
What’s your opinion on prediction markets? They seem to produce fairly good probability estimates so presumably the participants must be using some better-than-random process for arriving at numerical probability estimates for their predictions.
I agree with this assessment, but the accuracy of information aggregated by a prediction market implies nothing about your own individual certainty. Prediction markets work by cancelling out random errors and enabling specialists who wield esoteric expertise to take advantage of amateurs’ systematic biases. Where your own individual judgment falls within this picture, you cannot know, unless you’re one of these people with esoteric expertise.
I just don’t see any practical examples of people successfully betting by doing calculations with probability numbers derived from their intuitive feelings of confidence that would go beyond what a mere verbal expression of these feelings would convey. Can you think of any?
I’d speculate that bookies and professional sports bettors are doing something like this. By bookies here I mean primarily the kind of individuals who stand with a chalkboard at race tracks rather than the large companies. They probably use some semi-rigorous / scientific techniques to analyze past form and then mix it with a lot of intuition / expertise together with lots of detailed domain-specific knowledge and ‘insider’ info (a particular horse or jockey has recently recovered from an illness or injury and so may perform worse than expected, etc.). They’ll then integrate all of this information using some non-mathematically-rigorous, opaque mental process and derive a probability estimate which will determine what odds they are willing to offer or accept.
I’ve read a fair bit of material by professional investors and macro hedge fund managers describing their thinking and how they make investment decisions. I think they are often doing something similar. Integrating information derived from rigorous analysis with more fuzzy / intuitive reasoning based on expertise, knowledge and experience and using it to derive probabilities for particular outcomes. They then seek out investments that currently appear to be mis-priced relative to the probabilities they’ve estimated, ideally with a fairly large margin of safety to allow for the imprecise and uncertain nature of their estimates.
It’s entirely possible that this is not what’s going on at all but it appears to me that something like this is a factor in the success of anyone who consistently profits from dealing with risk and uncertainty.
The problem with discussing investment strategies is that any non-trivial public information about this topic necessarily has to be bullshit, or at least drowned in bullshit to the point of being irrecoverable, since exclusive possession of correct information is a sure path to getting rich, but its effectiveness critically depends on exclusivity.
My experience leads me to believe that this is not entirely accurate. Investors are understandably reluctant to share very specific, time-critical investment ideas for free, but they frequently share their thought processes and talk in general terms about their approaches, and my impression is that they are no more obfuscatory or deliberately misleading than anyone else who talks about their success in any field.
In addition, hedge fund investor letters often share quite specific details of reasoning after the fact once profitable trades have been closed and these kinds of details are commonly elaborated in books and interviews once time-sensitive information has lost most of its value.
Either your “rationality” manifests itself only in irrelevant matters, or you have to ask yourself what is so special and exclusive about you that you’re reaping practical success that eludes so many other people, and in such a way that they can’t just copy your approach.
This seems to be taking the ethos of the EMH a little far. I comfortably attribute a significant portion of my academic and career success to being more intelligent and a clearer thinker than most people. Anyone here who through a sense of false modesty believes otherwise is probably deluding themselves.
Where your own individual judgment falls within this picture, you cannot know, unless you’re one of these people with esoteric expertise.
This seems to be the main point of ongoing calibration exercises. If you have a track record of well calibrated predictions then you can gain some confidence that your own individual judgement is sound.
Overall I don’t think we have a massive disagreement here. I agree with most of your reservations and I’m by no means certain that improving one’s own calibration is feasible but I suspect that it might be and it seems sufficiently instrumentally useful that I’m interested in trying to improve my own.
I’d speculate that bookies and professional sports bettors are doing something like this. [...] I’ve read a fair bit of material by professional investors and macro hedge fund managers describing their thinking and how they make investment decisions. I think they are often doing something similar.
Your knowledge about these trades seems to be much greater than mine, so I’ll accept these examples. In the meantime, I have expounded my whole view of the topic in a reply to an excellent systematic list of questions posed by prase, and in those terms, this would indicate the existence of what I called the third type of exceptions under point (3). I still maintain that these are rare exceptions in the overall range of human judgments, though, and that my basic point holds for the overwhelming majority of human common-sense thinking.
Investors are understandably reluctant to share very specific time critical investment ideas for free but they frequently share their thought processes for free and talk in general terms about their approaches and my impression is that they are no more obfuscatory or deliberately misleading than anyone else who talks about their success in any field.
I don’t think they’re being deliberately misleading. I just think that the whole mechanism by which the public discourse on these topics comes into being inherently generates a nearly impenetrable confusion, which you can dispel to extract useful information only if you are already an expert in the first place. There are many specific reasons for this, but it all ultimately comes down to the stability of the weak EMH equilibrium.
This seems to be taking the ethos of the EMH a little far. I comfortably attribute a significant portion of my academic and career success to being more intelligent and a clearer thinker than most people. Anyone here who through a sense of false modesty believes otherwise is probably deluding themselves.
Oh, absolutely! But you’re presumably estimating the rank of your abilities based on some significant accomplishments that most people would indeed find impossible to achieve. What I meant to say (even though I expressed it poorly) is that there is no easy and readily available way to excel at “rationality” in any really relevant matters. This in contrast to the attitude, sometimes seen among the people here, that you can learn about Bayesianism or whatever else and just by virtue of that set yourself apart from the masses in accuracy of thought. The EMH ethos is, in my opinion, a good intellectual antidote against such temptations of hubris.
Given your position on the meaninglessness of assigning a numerical probability value to a vague feeling of how likely something is, how would you decide whether you were being offered good odds if offered a bet?
In reality, it is rational to bet only with people over whom you have superior relevant knowledge, or with someone who is suffering from an evident failure of common sense
You’re dodging the question. What if the odds arose from a natural process, so that there isn’t a person on the other side of the bet to compare your state of knowledge against?
I think this is right. The idea that you would be betting against another person is inessential, an unfortunate distraction arising from the choice of thought experiment. Admittedly it’s a natural way to understand the thought experiment, but it’s inessential. The experiment could be revised to exclude it. In fact every moment we make decisions whose outcomes depend on things we don’t know, and in making those decisions we are therefore in effect gambling. We are surrounded by risks, and our decisions reveal our assessment of those risks.
You’re dodging the question. What if the odds arose from a natural process, so that there isn’t a person on the other side of the bet to compare your state of knowledge against?
Maybe it’s my failure of English comprehension (I’m not a native speaker, as you might guess from my frequent grammatical errors), but when I read the phrase “being offered good odds if offered a bet,” I understood it as asking about a bet with opponents who stand to lose if my guess is right. So, honestly, I wasn’t dodging the question.
But to answer your question, it depends on the concrete case. Some natural processes can be approximated with models that yield useful probability estimates, and faced with some such process, I would of course try to use the best scientific knowledge available to calculate the odds if the stakes are high enough to justify the effort. When this is not possible, however, the only honest answer is that my decision would be guided by whatever intuitive feeling my brain happens to produce after some common-sense consideration, and unless this intuitive feeling told me that losing the bet is extremely unlikely, I would refuse to bet. And I honestly cannot think of a situation where translating this intuitive feeling of certainty into numbers would increase the clarity and accuracy of my thinking, or provide for any useful practical guidelines.
For example, if I come across a ditch and decide to jump over to save the effort of walking around to cross over a bridge, I’m effectively betting that it’s narrow enough to jump over safely. In reality, I’ll feel intuitively either that it’s safe to jump or not, and I’ll act on that feeling, produced by some opaque module for physics calculations in my brain. Of course, my conclusion might be wrong, and as a kid I would occasionally injure myself by judging wrongly in such situations, but how can I possibly quantify this feeling of certainty numerically in a meaningful way? It simply makes no sense. The overwhelming majority of real-life cases where I have to produce some judgment, and perhaps even bet on it, are of this sort.
It would be cool to have a brain that produces confidence estimates for its conclusions with greater precision, but mine simply isn’t like that, and it’s useless to pretend that it is.
When this is not possible, however, the only honest answer is that my decision would be guided by whatever intuitive feeling my brain happens to produce after some common-sense consideration, and unless this intuitive feeling told me that losing the bet is extremely unlikely, I would refuse to bet.
Applying the view of probability as willingness to bet, you can’t refuse to reveal your probability assignments. Life continually throws at us risky choices. You can perform risky action X with high-value success Y and high-cost failure Z or you can refuse to perform it, but both actions reveal something about your probability assignments. If you perform the risky action X, it reveals that you assign sufficiently high probability to Y (i.e. low to Z) given the values that you place on Y and Z. If you refuse to perform risky action X, it reveals that you assign sufficiently low probability to Y given the values you place on Y and Z. This is nothing other than your willingness to bet.
In an actual case, your simple yes/no response to a given choice is not enough to reveal your probability assignment and only reveals some information about it (that it is below or above a certain value). But counterfactually, we can imagine infinite variations on the choice you are presented with, and for each of these choices, there is a response which (counterfactually) you would have given. This set of responses manifests your probability assignment (and reveals also its degree of precision). Of course, in real life, we can’t usually conduct an experiment that reveals a substantial portion of this set of counterfactuals, so in real life, we remain in the dark about your probability assignment (unless we find some clever alternative way to elicit it than the direct, brute force test-all-variations approach I have just described). But the counterfactuals are still there, and still define a probability assignment, even if we don’t know what it is.
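As a minimal sketch of how a single observed choice pins down a one-sided bound on the underlying probability (the utility numbers are illustrative assumptions, not anything elicited from an actual experiment):

```python
# Revealed probability bound from one observed choice, assuming the agent
# maximizes expected value and that doing nothing is worth 0.
# Acting is worthwhile iff p*success_value - (1-p)*failure_cost >= 0,
# i.e. iff p >= failure_cost / (success_value + failure_cost).

def revealed_bound(took_action: bool, success_value: float, failure_cost: float) -> str:
    threshold = failure_cost / (success_value + failure_cost)
    if took_action:
        return f"P(success) >= {threshold:.2f}"
    return f"P(success) <= {threshold:.2f}"

# Illustrative assumption: success Y is worth +10, failure Z costs 40.
print(revealed_bound(True, success_value=10.0, failure_cost=40.0))   # P(success) >= 0.80
print(revealed_bound(False, success_value=10.0, failure_cost=40.0))  # P(success) <= 0.80
```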
And I honestly cannot think of a situation where translating this intuitive feeling of certainty into numbers would increase the clarity and accuracy of my thinking, or provide for any useful practical guidelines.
But this revealed probability assignment is parallel to revealed preference. The point of revealed preference is not to help the consumer make better choices. It is a conceptual and sometimes practical tool of economics. The economist studying people discovers their preferences by observing their purchases. And similarly, we can discover a person’s probability assignments by observing his choices. The purpose need not be to help that person to increase the clarity or accuracy of his own thinking, any more than the purpose of revealed preference is to help the consumer shop.
A person interested in self-knowledge, for whatever reason, might want to observe his own behavior in order to discover his own preferences. I think that people like Roissy in DC may be able to teach women about themselves if they choose to read him, teach them about what they really want in a man by pointing out what their behavior is, pointing out that they pursue certain kinds of men and shun others. Women—along with everybody else—are apparently suffering from many delusions about what they want, thinking they want one thing, but actually wanting another—as revealed by their behavior. This self-knowledge may or may not be helpful, but surely at least some women would be interested in it.
For example, if I come across a ditch and decide to jump over to save the effort of walking around to cross over a bridge, I’m effectively betting that it’s narrow enough to jump over safely.
But as a matter of fact your choice is influenced by several factors, including the reward of successfully jumping over the ditch (i.e. the reduction in walking time) and the cost of attempting the jump and failing, along with the width of the gap. As these factors are (counterfactually) varied, a possibly precise picture of your probability assignment may emerge. That is, it may turn out that you are willing to risk the jump if failure would only sprain an ankle, but unwilling to risk the jump if failure is certain death. This would narrow down the probability of success that you have assigned to the jump—it would be probable enough to be worth risking the sprained ankle, but not probable enough to be worth risking certain death. This probability assignment is not necessarily anything that you have immediately available to your conscious awareness, but in principle it can be elicited through experimentation with variations on the scenario.
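A rough sketch of how such counterfactual variation could bracket the jumper’s implicit probability (the benefit and cost figures are made up purely for illustration):

```python
# Bracketing an implicit probability by varying the stakes of the same decision.
# jump_threshold: minimum P(clearing the ditch) at which jumping beats walking
# around, for an expected-value maximizer (walking around is worth 0).

def jump_threshold(benefit: float, failure_cost: float) -> float:
    return failure_cost / (benefit + failure_cost)

benefit = 5.0              # value of the walk saved (illustrative)
sprain_cost = 20.0         # cost of a sprained ankle (illustrative)
death_cost = 1_000_000.0   # cost of certain death (illustrative)

lower = jump_threshold(benefit, sprain_cost)   # willing to jump here  => p >= lower
upper = jump_threshold(benefit, death_cost)    # unwilling to jump here => p <= upper
print(f"Revealed bracket: {lower:.2f} <= P(clearing the ditch) <= {upper:.6f}")
```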
Are you asking for a defense of the statement, or do you agree with it and are merely commenting on the way I expressed it?
I’ll give a defense by means of an example. At Wikipedia they give the following example of a counterfactual:
If Oswald had not shot Kennedy, then someone else would have.
Now consider the equation F=ma. This is translated at Wikipedia into the English:
A body of mass m subject to a force F undergoes an acceleration a that has the same direction as the force and a magnitude that is directly proportional to the force and inversely proportional to the mass, i.e., F = ma.
Now suppose that there is a body of mass m floating in space, and that it has not been subject to nor is it currently subject to any force. I believe that the following is a true counterfactual statement about the body:
Had this body (of mass m) been subject to a force F then it would have undergone an acceleration a that would have had the same direction as the force and a magnitude that would have been directly proportional to the force and inversely proportional to the mass.
That is a counterfactual statement following the model of the wikipedia example, and I believe it is true, and I believe that the contradiction of the counterfactual (which is also a counterfactual, i.e., the claim that the body would not have undergone the stated acceleration) is false.
I believe that this point can be extended to all the laws of physics, either Newton’s laws or, if they have been replaced, modern laws. And I believe, furthermore, that the point can be extended to higher-level statements about bodies which are not mere masses moving in space, but, say, thinking creatures making decisions.
Is there any part of this with which you disagree?
A point about the insertion of “I believe”. The phrase “I believe” is sometimes used by people to assert their religious beliefs. I don’t consider the point I am making to be a personal religious belief, but the plain truth. I only insert “I believe” because the very fact that you brought up the issue tells me that I may be in mixed company that includes someone whose philosophical education has instilled certain views.
Are you sure you’re not just worried about poor calibration?
No, my objection is fundamental. I provide a brief explanation in the comment I linked to, but I’ll restate it here briefly.
The problem is that the algorithms that your brain uses to perform common-sense reasoning are not transparent to your conscious mind, which has access only to their final output. This output does not provide a numerical probability estimate, but only a rough and vague feeling of certainty. Yet in most situations, the output of your common sense is all you have. There are very few interesting things you can reason about by performing mathematically rigorous probability calculations (and even when you can, you still have to use common sense to establish the correspondence between the mathematical model and reality).
Therefore, there are only two ways in which you can arrive at a numerical probability estimate for a common-sense belief:
Translate your vague feeling of certainty into a number in some arbitrary manner. This however makes the number a mere figure of speech, which adds absolutely nothing over the usual human vague expressions for different levels of certainty.
Perform some probability calculation, which however has nothing to do with how your brain actually arrived at your common-sense conclusion, and then assign the probability number produced by the former to the latter. This is clearly fallacious.
Honestly, all this seems entirely obvious to me. I would be curious to see which points in the above reasoning are supposed to be even controversial, let alone outright false.
Translate your vague feeling of certainty into a number in some arbitrary manner. This however makes this number a mere figure of speech, which adds absolutely nothing over the usual human vague expressions for different levels of certainty.
Disagree here. Numbers get people to convey more information about their beliefs. It doesn’t matter whether you actually use numbers, or do something similar (and equivalent) like systematize the use of vague expressions. I’d be just as happy if people used a “five-star” system, or even in many cases if they just compared the belief in question to other beliefs used as reference-points.
Perform some probability calculation, which however has nothing to do with how your brain actually arrived at your common-sense conclusion, and then assign the probability number produced by the former to the latter. This is clearly fallacious.
Disagree here also. The probability calculation you present should represent your brain’s reasoning, as revealed by introspection. This is not a perfect process, and may be subject to later refinement. But it is definitely meaningful.
For example, consider my current probability estimate of 10^(-3) that Amanda Knox killed her roommate. On my current analysis, this is obtained as follows: I start with a prior of 10^(-4) (from a general homicide rate of about 10^(-3), plus reasoning that Knox is demographically an order of magnitude less likely to kill than the typical person; the figure happens to match my intuitive sense that I’d have to meet about 10,000 similar people before I’d have any fear for my life). Then all the evidence in the case raises the probability by about an order of magnitude at most, yielding 10^(-3).
Now, this is just a rough order-of-magnitude argument. But it’s already much more meaningful and useful than my just saying “I don’t think she did it”. It provides a way of breaking down the reasoning, so that points of disagreement can be precisely identified in an efficient manner. (If you happened to disagree, the next step would be to say something like “but surely evidence X alone raises the odds by more than a factor of ten”, and then we’d iterate the process specifically on X rather than the original proposition.)
It’s a very useful technique for keeping debates informative, and preventing them from turning into (pure) status signaling contests.
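For concreteness, here is a sketch of that same order-of-magnitude calculation in odds form (it uses only the two figures stated above, and is no more precise than they are):

```python
# Order-of-magnitude Bayesian update in odds form.
prior_prob = 1e-4          # demographic prior stated above
likelihood_ratio = 10.0    # stated upper bound on the total strength of the case evidence

prior_odds = prior_prob / (1.0 - prior_prob)
posterior_odds = prior_odds * likelihood_ratio
posterior_prob = posterior_odds / (1.0 + posterior_odds)

print(f"Posterior probability of guilt: about {posterior_prob:.0e}")   # ~1e-3
```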
Numbers get people to convey more information about their beliefs. It doesn’t matter whether you actually use numbers, or do something similar (and equivalent) like systematize the use of vague expressions. I’d be just as happy if people used a “five-star” system, or even in many cases if they just compared the belief in question to other beliefs used as reference-points.
If I understand correctly, you’re saying that talking about numbers rather than the usual verbal expressions of certainty prompts people to be more careful and re-examine their reasoning more strictly. This may be true sometimes, but on the other hand, numbers also tend to give a false feeling of accuracy and rigor where there is none. One of the usual symptoms (and, in turn, catalysts) of pseudoscience is the use of numbers with spurious precision and without rigorous justification.
In any case, you seem to concede that these numbers ultimately don’t convey any more information than various vague verbal expressions of confidence. If you want to make the latter more systematic and clear, I have no problem with that, but I see no way to turn them into actual numbers without introducing spurious precision.
The probability calculation you present should represent your brain’s reasoning, as revealed by introspection. This is not a perfect process, and may be subject to later refinement. But it is definitely meaningful.
Trouble is, this is often not possible. Most of what happens in your brain is not amenable to introspection, and you cannot devise a probability calculation that will capture all the important things that happen there. Take your own example:
For example, consider my current probability estimate of 10^(-3) that Amanda Knox killed her roommate. On my current analysis, this is obtained as follows: I start with a prior of 10^(-4) (from a general homicide rate of about 10^(-3), plus reasoning that Knox is demographically an order of magnitude less likely to kill than the typical person; the figure happens to match my intuitive sense that I’d have to meet about 10,000 similar people before I’d have any fear for my life). Then all the evidence in the case raises the probability by about an order of magnitude at most, yielding 10^(-3).
See, this is where, in my opinion, you’re introducing spurious numerical claims that are at best unnecessary and at worst outright misleading.
First you note that murderers are extremely rare, and that AK is a sort of person especially unlikely to be one. OK, say you can justify these numbers by looking at crime statistics. Then you perform a complex common-sense evaluation of the evidence, and your brain tells you that on the whole it’s weak, so it’s highly unlikely that AK killed the victim. So far, so good. But then you insist on turning this feeling of near-certainty about AK’s innocence into a number, and you end up making a quantitative claim that has no justification at all. You say:
Now, this is just a rough order-of-magnitude argument. But it’s already much more meaningful and useful than my just saying “I don’t think she did it”.
I strongly disagree. Neither is this number you came up with any more meaningful than the simple plain statement “I think it’s highly unlikely she did it,” nor does it offer any additional practical benefit. On the contrary, it suggests that you can actually make a mathematically rigorous case that the number is within some well-defined limits. (Which you do disclaim, but which is easy to forget.)
Even worse, your claims suggest that while your numerical estimates may be off by an order of magnitude or so, the model you’re applying to the problem captures reality well enough that it’s only necessary to plug in accurate probability estimates. But how do you know that the model is correct in the first place? Your numbers are ultimately based on an entirely non-mathematical application of common sense in constructing this model—and the uncertainty introduced there is altogether impossible for you to quantify meaningfully.
Let’s see if we can try to hug the query here. What exactly is the mistake I’m making when I say that I believe such-and-such is true with probability 0.001?
Is it that I’m not likely to actually be right 999 times out of 1000 occasions when I say this? If so, then you’re (merely) worried about my calibration, not about the fundamental correspondence between beliefs and probabilities.
Or is it, as you seem now to be suggesting, a question of attire: no one has any business speaking “numerically” unless they’re (metaphorically speaking) “wearing a lab coat”? That is, using numbers is a privilege reserved for scientists who’ve done specific kinds of calculations?
It seems to me that the contrast you are positing between “numerical” statements and other indications of degree is illusory. The only difference is that numbers permit an arbitrarily high level of precision; their use doesn’t automatically imply a particular level. Even in the context of scientific calculations, the numbers involved are subject to some particular level of uncertainty. When a scientist makes a calculation to 15 decimal places, they shouldn’t be interpreted as distinguishing between different 20-decimal-digit numbers.
Likewise, when I make the claim that the probability of Amanda Knox’s guilt is 10^(-3), that should not be interpreted as distinguishing (say) between 0.001 and 0.002. It’s meant to be distinguished from 10^(-2) and (perhaps) 10^(-4). I was explicit about this when I said it was an order-of-magnitude estimate. You may worry that such disclaimers are easily forgotten—but this is to disregard the fact that similar disclaimers always apply whenever numbers are used in any context!
In any case, you seem to concede that these numbers ultimately don’t convey any more information than various vague verbal expressions of confidence. If you want to make the latter more systematic and clear, I have no problem with that, but I see no way to turn them into actual numbers without introducing spurious precision.
Here’s the way I do it: I think approximately in terms of the following “scale” of improbabilities:
(1) 10% to 50% (mundane surprise)
(2) 1% to 10% (rare)
(3) 0.1% (=10^(-3)) to 1% (once-in-a-lifetime level surprise on an important question)
(4) 10^(-6) to 10^(-3) (dying in a plane crash or similar)
(5) 10^(-10) to 10^(-6) (winning the lottery; having an experience unique among humankind)
(6) 10^(-100) to 10^(-10) (religions are true)
(7) below 10^(-100) (theoretical level of improbability reached in thought experiments).
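As a sketch, the scale can be read as a simple lookup from a probability to its band (the bands are the ones above; the exact cut points matter much less than the order of magnitude):

```python
# Map a probability to its band on the improbability scale above.
SCALE = [
    (1e-1,   "(1) mundane surprise"),
    (1e-2,   "(2) rare"),
    (1e-3,   "(3) once-in-a-lifetime surprise on an important question"),
    (1e-6,   "(4) dying in a plane crash or similar"),
    (1e-10,  "(5) winning the lottery; unique experience"),
    (1e-100, "(6) religions are true"),
    (0.0,    "(7) thought-experiment-level improbability"),
]

def classify(p: float) -> str:
    for lower_bound, label in SCALE:
        if p >= lower_bound:
            return label
    return SCALE[-1][1]

print(classify(0.03))   # (2) rare
print(classify(1e-5))   # (4) dying in a plane crash or similar
```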
Love the logic and the scale, although I think Vladimir_M pokes some important holes specifically at the 10^(-2) to 10^(-3) level.
May I suggest “unplanned-for errors”? In my experience, it is not useful to plan for contingencies with about a 1⁄300 chance of happening per trial. For example, on any given day of the year, my favorite cafe might be closed due to the owner’s illness, but I do not call the cafe first to confirm that it is open each time I go there. At any given time, one of my 300-ish acquaintances is probably nursing a grudge against me, but I do not bother to open each conversation with “Hi, do you still like me today?” When, as inevitably happens, I run into a closed cafe or a hostile friend, I usually stop short for a bit; my planning mechanism reports a bug; there is no ‘action string’ cached for that situation, for the simple reason that I was not expecting the situation, because I did not plan for the situation, because that is how rare it is. Nevertheless, I am not ‘surprised’—I know at some level that things that happen about 1⁄300 times are sort of prone to happening once in a while. On the other hand, I would be ‘surprised’ if my favorite cafe had been burned to the ground or if my erstwhile buddy had taken a permanent vow of silence. I expect that these things will never happen to me, and so if they happen I go and double-check my calculations and assumptions, because it seems equally likely that I am wrong about my assumptions as that the 1⁄30,000 event would actually occur. Anyway, the point is that a category 3 event is an event that makes you shut up for a moment but doesn’t make you reexamine any core beliefs.
If you hold most of your core beliefs with probability > .993 then you are almost certainly overconfident in your core beliefs. I’m not talking about stuff like “my senses offer moderately reliable evidence” or “F(g) = GMm/(r^2)”; I’m talking about stuff like “Solomonoff induction predicts that hyperintelligent AIs will employ a timeless decision theory.”
(3) 0.1% (=10^(-3)) to 1% (once-in-a-lifetime level surprise on an important question)
10^-3 is roughly the probability that I try to start my car and it won’t start because the battery has gone bad. Is the scale intended only for questions one asks once per lifetime? There are lots of questions that one asks once a day, hence my car example.
That is precisely why I added the phrase “on an important question”. It was intended to rule out exactly those sorts of things.
The intended reference class (for me) consists of matters like the Amanda Knox case. But if I got into the habit of judging similar cases every day, that wouldn’t work either.
What exactly is the mistake I’m making when I say that I believe such-and-such is true with probability 0.001? Is it that I’m not likely to actually be right 999 times out of 1000 occasions when I say this? If so, then you’re (merely) worried about my calibration, not about the fundamental correspondence between beliefs and probabilities.
It’s not that I’m worried about your poor calibration in some particular instance, but that I believe that accurate calibration in this sense is impossible in practice, except in some very special cases.
(To give some sense of the problem, if such calibration were possible, then why not calibrate yourself to generate accurate probabilities about stock market movements and bet on them? It would be an easy and foolproof way to get rich. But of course there is no way you can make your numbers match reality, not in this problem, nor in most other ones.)
Or is it, as you seem now to be suggesting, a question of attire: no one has any business speaking “numerically” unless they’re (metaphorically speaking) “wearing a lab coat”? That is, using numbers is a privilege reserved for scientists who’ve done specific kinds of calculations?
The way you put it, “scientists” sounds too exclusive. Carpenters, accountants, cashiers, etc. also use numbers and numerical calculations in valid ways. However, their use of numbers can ultimately be scrutinized and justified in similar ways as the scientific use of numbers (even if they themselves wouldn’t be up to that task), so with that qualification, my answer would be yes.
(And unfortunately, in practice it’s not at all rare to see people using numbers in ways that are fundamentally unsound, which sometimes gives rise to whole edifices of pseudoscience. I discussed one such example from economics in this thread.)
Now, you say:
It seems to me that the contrast you are positing between “numerical” statements and other indications of degree is illusory. The only difference is that numbers permit an arbitrarily high level of precision; their use doesn’t automatically imply a particular level. Even in the context of scientific calculations, the numbers involved are subject to some particular level of uncertainty. When a scientist makes a calculation to 15 decimal places, they shouldn’t be interpreted as distinguishing between different 20-decimal-digit numbers.
However, when a scientist makes a calculation with 15 digits of precision, or even just one, he must be able to rigorously justify this degree of precision by pointing to observations that are incompatible with the hypothesis that any of these digits, except the last one, is different. (Or in the case of mathematical constants such as pi and e, to proofs of the formulas used to calculate them.) This disclaimer is implicit in any scientific use of numbers. (Assuming valid science is being done, of course.)
And this is where, in my opinion, you construct an invalid analogy:
Likewise, when I make the claim that the probability of Amanda Knox’s guilt is 10^(-3), that should not be interpreted as distinguishing (say) between 0.001 and 0.002. It’s meant to be distinguished from 10^(-2) and (perhaps) 10^(-4). I was explicit about this when I said it was an order-of-magnitude estimate. You may worry that such disclaimers are easily forgotten—but this is to disregard the fact that similar disclaimers always apply whenever numbers are used in any context!
But these disclaimers are not at all the same! The scientist’s—or the carpenter’s, for that matter—implicit disclaimer is: “This number is subject to this uncertainty interval, but there is a rigorous argument why it cannot be outside that range.” On the other hand, your disclaimer is: “This number was devised using an intuitive and arbitrary procedure that doesn’t provide any rigorous argument about the range it must be in.”
And if I may be permitted such a comment, I do see lots of instances here where people seem to forget about this disclaimer, and write as if they believed that they could actually become Bayesian inferers, rather than creatures who depend on capricious black-box circuits inside their heads to make any interesting judgment about anything, and who are (with the present level of technology) largely unable to examine the internal functioning of these boxes and improve them.
Here’s the way I do it: I think approximately in terms of the following “scale” of improbabilities:
I don’t think such usage is unreasonable, but I think it falls under what I call using numbers as vague figures of speech.
To give some sense of the problem, if such calibration were possible, then why not calibrate yourself to generate accurate probabilities about the stock market movements and bet on them? It would be an easy and foolproof way to get rich.
I think this statement reflects either an ignorance of finance or the Dark Arts.
First, the stock market is the single worst place to try to test out ideas about probabilities, because so many other people are already trying to predict the market, and so much wealth is at stake. Other people’s predictions will remove most of the potential for arbitrage (reducing ‘signal’), and the insider trading and other forms of cheating generated by the potential for quick wealth will further distort any scientifically detectable trends in the market (increasing ‘noise’). Because investments in the stock market must be made in relatively large quantities to avoid losing your money through trading commissions, a causal theory tester is likely to run out of money long before hitting a good payoff even if he or she is already well-calibrated.
Of course, in real life, people might be moderately-calibrated. The fact that one is capable of making some predictions with some accuracy and precision is not a guarantee that one will be able to reliably and detectably beat even a thin market like a political prediction clearinghouse. Nevertheless, some information is often better than none: I am (rationally) much more concerned about automobile accidents than fires, despite the fact that I know two people who have died in fires and none who have died in automobile accidents. I know this based on my inferences from published statistics, the reliability of which I make further inferences about. I am quite confident (p ~ .95) that it is sensible to drive defensively (at great cost in effort and time) while essentially ignoring fire safety (even though checking a fire extinguisher or smoke detector might take minimal effort.)
I don’t play the stock market, though. I’m not that well calibrated, and probably nobody is without access to inside info of one kind or another.
I think this statement reflects either an ignorance of finance or the Dark Arts.
I’m not an expert on finance, but I am aware of everything you wrote about it in your comment. So I guess this leaves us with the second option. The Dark Arts hypothesis is probably that I’m using the extreme example of the stock market to suggest a general sweeping conclusion that in fact doesn’t hold in less extreme cases.
To which I reply: yes, the stock market is an extreme example, but I honestly can’t think of any other examples that would show otherwise. There are many examples of scientific models that provide more or less accurate probability estimates for all kinds of things, to be sure, but I have yet to hear about people achieving practical success in anything relevant by translating their common-sense feelings of confidence in various beliefs into numerical probabilities.
In my view, calibration of probability estimates can succeed only if (1) you come up with a valid scientific model which you can then use in a shut-up-and-calculate way instead of applying common sense (though you still need it to determine whether the model is applicable in the first place), or (2) you make an essentially identical judgment many times, and from your past performance you extrapolate how frequently the black box inside your head tends to be right.
Now, you try to provide some counterexamples:
I am (rationally) much more concerned about automobile accidents than fires, despite the fact that I know two people who have died in fires and none who have died in automobile accidents. I know this based on my inferences from published statistics, the reliability of which I make further inferences about. I am quite confident (p ~ .95) that it is sensible to drive defensively (at great cost in effort and time) while essentially ignoring fire safety (even though checking a fire extinguisher or smoke detector might take minimal effort.)
Frankly, the only subjective probability estimate I see here is the p~0.95 for your belief about driving. In this case, I’m not getting any more information from this number than if you just described your level of certainty in words, nor do I see any practical application to which you can put this number. I have no objection to your other conclusions, but I see nothing among them that would be controversial to even the most extreme frequentist.
Not sure who voted down your reply; it looks polite and well-reasoned to me.
I believe you when you say that the stock market was honestly intended as representative, although, of course, I continue to disagree about whether it actually is representative.
Here are some more counterexamples:
* When deciding whether to invest in an online bank that pays 1% interest or a local community bank that pays 0.1% interest, I must calculate the odds that each bank will fail before I take my money out; I cannot possibly have a scientific model that generates replicable results for these two banks while also holding down a day job, but numbers will nevertheless help me make a decision that is not driven by an emotional urge to stay with (or leave) an old bank based on customer service considerations that I rationally value as far less than the value of my principal. (See the sketch below this list.)
* When deciding whether to donate time, money, or neither to a local election campaign, it will help to know which of my donations will have a 10^-6 chance, a 10^-4 chance, and a 10^-2 chance of swinging the election. Numbers are important here because irrational friends and colleagues will urge me to do what ‘feels right’ or to ‘do my part’ without pausing to consider whether this serves any of our goals. If I can generate a replicable scientific model that says whether an extra $500 will win an election, I should stop electioneering and sign up for a job as a tenured political science faculty member, but I nevertheless want to know what the odds are, approximately, in each case, if only so that I can pick which campaign to work on.
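Here is a minimal sketch of the bank comparison from the first bullet (the failure probabilities and recovery assumption are illustrative placeholders, precisely the sort of numbers under dispute here, not estimates for any actual bank):

```python
# Expected end-of-year balance for each bank, given a subjective one-year
# failure probability. All probability and recovery figures are illustrative.
principal = 10_000.0
recovery_if_failed = 0.0   # pessimistic assumption: nothing recovered on failure

def expected_balance(interest_rate: float, p_fail: float) -> float:
    return (1 - p_fail) * principal * (1 + interest_rate) + p_fail * principal * recovery_if_failed

print(f"Online bank (1.0%):    {expected_balance(0.010, p_fail=0.002):,.2f}")
print(f"Community bank (0.1%): {expected_balance(0.001, p_fail=0.001):,.2f}")
# Even a rough failure-probability estimate is enough to see which way the
# comparison goes, which is the point of attaching a number at all.
```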
As for your objection that:
the only subjective probability estimate I see here is the p~0.95 for your belief about driving. In this case, I’m not getting any more information from this number than if you just described your level of certainty in words,
I suppose I have left a few steps out of my analysis, which I am spelling out in full now:
*Published statistics say that the risk of dying in a fire is 10^-7/people-year and the risk of dying in a car crash is 10^-4/people-year (a report of what is no doubt someone else’s subjective but relatively evidence-based estimate).
*The odds that these statistics are off by more than a factor of 10 relative to each other are less than 10^-1 (a subjective estimate).
*My cost in effort to protect against car crashes is more than 10 times higher than my cost in effort to protect against fires.
*I value the disutility of death-by-fire and death-by-car-crash roughly equally.
*Therefore, there exists a coherent utility function with respect to the relevant states of the world and my relevant strategies such that it is rational for me to protect against car crashes but not fires.
*Therefore, one technique that could be used to show that my behavior is internally incoherent has failed to reject the null hypothesis.
*Therefore, I have some Bayesian evidence that my behavior is rational.
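Here is the rough numerical sketch promised above, using the stated figures. The disutility scale, the effort units, and the assumption that each precaution eliminates the corresponding risk entirely are simplifications adopted purely for illustration:

    p_fire_death = 1e-7        # per person-year (step 1)
    p_car_death = 1e-4         # per person-year (step 1)
    error_factor = 10          # step 2: relative error unlikely to exceed this
    cost_fire_precautions = 1.0     # step 3: arbitrary effort units
    cost_car_precautions = 10.0     # step 3: more than 10x the fire effort
    disutility_of_death = 1e9       # step 4: same for either kind of death

    # Expected disutility averted per unit of effort, pretending (generously)
    # that each precaution eliminates the corresponding risk entirely.
    car_return = p_car_death * disutility_of_death / cost_car_precautions     # 10000.0
    fire_return = p_fire_death * disutility_of_death / cost_fire_precautions  # 100.0

    # Even if the relative statistics are off by the full factor of 10 against me,
    # defensive driving still dominates fire precautions:
    print(car_return / error_factor > fire_return)   # True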
Please let me know if you still think I’m just putting fancy arithmetic labels on what is essentially ‘frequentist’ reasoning, and, if so, exactly what you mean by ‘frequentist.’ I can Wikipedia the standard definition, but it doesn’t quite seem to fit here, imho.
Regarding your examples with banks and donations, when I imagine myself in such situations, I still don’t see how numbers derived from my own common-sense reasoning can be useful. I can see myself making a decision based on a simple common-sense impression that one bank looks less shady, or that it’s bigger and thus more likely to be bailed out, etc. Similarly, I could act on a vague impression that one political candidacy I’d favor is far more hopeless than another, and so on. On the other hand, I could also judge from the results of calculations based on numbers from real expert input, like actuarial tables for failures of banks of various types, or the poll numbers for elections, etc.
What I cannot imagine, however, is doing anything sensible and useful with probabilities dreamed up from vague common-sense impressions. For example, looking at a bank, getting the impression that it’s reputable and solid, and then saying, “What’s the probability it will fail before time T? Um.. seems really unlikely… let’s say 0.1%.”, and then using this number to calculate my expected returns.
Now, regarding your example with driving vs. fires, suppose I simply say: “Looking at the statistical tables, one is far more likely to be killed in a car accident than in a fire. I don’t see any way in which I’m exceptional in my exposure to either, so if I want to make myself safer, it would be stupid to invest more effort in reducing the chance of fire than in more careful driving.” What precisely have you gained with your calculation relative to this plain and clear English statement?
In particular, what is the significance of these subjectively estimated probabilities like p=10^-1 in step 2? What more does this number tell us than a simple statement like “I don’t think it’s likely”? Also, notice that my earlier comment specifically questioned the meaningfulness and practical usefulness of the numerical claim that p~0.95 for this conclusion, and I don’t see how it comes out of your calculation. These seem to be exactly the sorts of dreamed-up probability numbers whose meaningfulness I’m denying.
It seems plausible to me that routinely assigning numerical probabilities to predictions/beliefs that can be tested and tracking these over time to see how accurate your probabilities are (calibration) can lead to a better ability to reliably translate vague feelings of certainty into numerical probabilities.
There are practical benefits to developing this ability. I would speculate that successful bookies and professional sports bettors are better at this than average, for example, and that this is an ability they have developed through practice and experience. Anyone who has to make decisions under uncertainty could benefit from a well-developed ability to assign well-calibrated numerical probability estimates to vague feelings of certainty. Investors, managers, engineers and others who must deal with uncertainty on a regular basis would surely find this ability useful.
I think a certain degree of skepticism is justified regarding the utility of various specific methods for developing this ability (things like predictionbook.com don’t yet have hard evidence for their effectiveness) but it certainly seems like it is a useful ability to have and so there are good reasons to experiment with various methods that promise to improve calibration.
I agree with most of what you’re saying (in that comment and this one) but I still think that the ability to give well calibrated probability estimates for a particular prediction is instrumentally useful and that it is fairly likely that this is an ability that can be improved with practice. I don’t take this to imply anything about humans performing actual Bayesian calculations either implicitly or explicitly.
I have read most of the responses and still am not sure whether to upvote or not. I am torn between several (possibly overlapping) interpretations of your statement. Could you say to what extent the following interpretations really reflect what you think?
(1) Confession of frequentism. Only sensible numerical probabilities are those related to frequencies, i.e. either frequencies of outcomes of repeated experiments, or probabilities derived from those. (Creative drawing of reference-class boundaries may be permitted.) In particular, prior probabilities are meaningless.
(2) Any sensible numbers must be produced using procedures that ultimately don’t include any numerical parameters (maybe except small integers like 2,3,4). Any number which isn’t a result of such a procedure is labeled arbitrary, and therefore meaningless. (Observation and measurement, of course, do count as permitted procedures. Admittedly arbitrary steps, like choosing units of measurement, are also permitted.)
(3) Degrees of confidence shall be expressed without reflexive thinking about them. Trying to establish a fixed scale of confidence levels (like impossible—very unlikely—unlikely—possible—likely—very likely—almost certain—certain), or actively trying to compare degrees of confidence in different beliefs is cheating, since such scales can then be converted into numbers using a non-numerical procedure.
(4) The question of whether somebody is well calibrated is confused for some reason. Calibrating people makes no sense. Although we may take the “almost certain” statements of a person and look at how often they are true, the resulting frequency makes no sense for some reason.
(5) Unlike #3, beliefs can be ordered or classified on some scale (possibly imprecisely), but assigning numerical values brings confusing connotations and should be avoided. Alternatively said, the meaning of subjective probabilities is preserved under monotonic rescaling.
(6) Although, strictly speaking, human reasoning can be modelled as a Bayesian network where beliefs have numerical strengths, human introspection is poor at assessing their values. Declared values more likely depend on anchoring than on the real strength of the belief. Speaking about numbers actually introduces noise into reasoning.
(7) Human reasoning cannot be modelled by Bayesian inference, not even in approximation.
That’s an excellent list of questions! It will help me greatly to systematize my thinking on the topic.
Before replying to the specific items you list, perhaps I should first state the general position I’m coming from, which motivates me to get into discussions of this sort. Namely, it is my firm belief that when we look at the present state of human knowledge, one of the principal sources of confusion, nonsense, and pseudoscience is physics envy, which leads people in all sorts of fields to construct nonsensical edifices of numerology and then pretend, consciously or not, that they’ve reached some sort of exact scientific insight. Therefore, I believe that whenever one encounters people talking about numbers of any sort that look even slightly suspicious, those numbers should be considered guilty until proven otherwise—and this entire business with subjective probability estimates for common-sense beliefs doesn’t come even close to clearing that bar for me.
Now to reply to your list.
(1) Confession of frequentism. Only sensible numerical probabilities are those related to frequencies, i.e. either frequencies of outcomes of repeated experiments, or probabilities derived from those. (Creative drawing of reference-class boundaries may be permitted.) In particular, prior probabilities are meaningless.
(2) Any sensible numbers must be produced using procedures that ultimately don’t include any numerical parameters (maybe except small integers like 2,3,4). Any number which isn’t a result of such a procedure is labeled arbitrary, and therefore meaningless. (Observation and measurement, of course, do count as permitted procedures. Admittedly arbitrary steps, like choosing units of measurement, are also permitted.)
My answer to (1) follows from my opinion about (2).
In my view, a number that gives any information about the real world must ultimately refer, either directly or via some calculation, to something that can be measured or counted (at least in principle, perhaps using a thought-experiment). This doesn’t mean that all sensible numbers have to be derived from concrete empirical measurements; they can also follow from common-sense insight and generalization. For example, reading about Newton’s theory leads to the common-sense insight that it’s a very close approximation of reality under certain assumptions. Now, if we look at the gravity formula F=m1*m2/r^2 (in units set so that G=1), the exponent 2 in the denominator is not a product of any concrete measurement, but a generalization from common sense. Yet what makes it sensible is that it ultimately refers to measurable reality via a well-defined formula: measure the force between two bodies of known masses at distance r, and you’ll get log(m1*m2/F)/log(r) = 2.
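To illustrate what I mean by “ultimately refers to measurable reality,” here is a toy sketch in which idealized noisy measurements keep handing back the exponent 2; the masses and the noise level are arbitrary choices made only for the example:

    import math, random

    def measured_force(m1, m2, r, noise=0.01):
        # An idealized noisy measurement of the gravitational force, with G = 1.
        return (m1 * m2 / r**2) * (1 + random.gauss(0, noise))

    m1, m2 = 3.0, 7.0
    for r in (2.0, 5.0, 10.0, 50.0):
        F = measured_force(m1, m2, r)
        exponent = math.log(m1 * m2 / F) / math.log(r)
        print(r, round(exponent, 3))   # hovers around 2 at every distance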
Now, what can we make out of probabilities from this viewpoint? I honestly can’t think of any sensible non-frequentist answer to this question. Subjectivist Bayesian phrases such as “the degree of belief” sound to me entirely ghostlike unless this “degree” is verifiable via some frequentist practical test, at least in principle. In this sense, I do confess frequentism. (Though I don’t wish to subscribe to all the related baggage from various controversies in statistics, much of which is frankly over my head.)
(3) Degrees of confidence shall be expressed without reflexive thinking about them. Trying to establish a fixed scale of confidence levels (like impossible—very unlikely—unlikely—possible—likely—very likely—almost certain—certain), or actively trying to compare degrees of confidence in different beliefs is cheating, since such scales can then be converted into numbers using a non-numerical procedure.
That depends on the concrete problem under consideration, and on the thinker who is considering it. The thinker’s brain produces an answer alongside a more or less fuzzy feeling of confidence, and human language has the capacity to express these feelings with about the same level of fuzziness as that signal. It can be sensible to compare intuitive confidence levels, if such comparison can be put to a practical (i.e. frequentist) test. Eight ordered intuitive levels of certainty might perhaps be too much, but with, say, four levels, I could produce four lists of predictions labeled “almost impossible,” “unlikely,” “likely,” and “almost certain,” such that common sense would tell us that, with near-certainty, those in each subsequent list would turn out to be true in ever greater proportion.
If I wish to express these probabilities as numbers, however, this is not a legitimate step unless the resulting numbers can be justified in the sense discussed above under (1) and (2). This requires justification both in the sense of defining what aspect of reality they refer to (where frequentism seems like the only answer), and guaranteeing that they will be accurate under empirical tests. If they can be so justified, then we say that the intuitive estimate is “well-calibrated.” However, calibration is usually not possible in practice, and there are only two major exceptions.
The first possible path towards accurate calibration is when the same person performs essentially the same judgment many times, and from the past performance we extract the frequency with which their brain tends to produce the right answer. If this level of accuracy remains roughly constant in time, then it makes sense to attach it as the probability to that person’s future judgments on the topic. This approach treats the relevant operations in the brain as a black box whose behavior, being roughly constant, can be subjected to such extrapolation.
The second possible path is reached when someone has a sufficient level of insight about some problem to cross the fuzzy limit between common-sense thinking and an actual scientific model. Increasingly subtle and accurate thinking about a problem can result in the construction of a mathematical model that approximates reality well enough that when applied in a shut-up-and-calculate way, it yields probability estimates that will be subsequently vindicated empirically.
(Still, deciding whether the model is applicable in some particular situation remains a common-sense problem, and the probabilities yielded by the model do not capture this uncertainty. If a well-established physical theory, applied by competent people, says that p=0.9999 for some event, common sense tells me that I should treat this event as near-certain—and, if repeated many times, that it will come out the unlikely way very close to one in 10,000 times. On the other hand, if p=0.9999 is produced by some suspicious model that looks like it might be a product of data-dredging rather than real insight about reality, common sense tells me that the event is not at all certain. But there is no way to capture this intuitive uncertainty with a sensible number. The probabilities coming from calibration of repeated judgment are subject to analogous unquantifiable uncertainty.)
There is also a third logical possibility, namely that some people in some situations have precise enough intuitions of certainty that they can quantify them in an accurate way, just like some people can guess what time it is with remarkable precision without looking at a clock. But I see little evidence of this occurring in reality, and even if it does, these are very rare special cases.
(4) The question of whether somebody is well calibrated is confused for some reason. Calibrating people makes no sense. Although we may take the “almost certain” statements of a person and look at how often they are true, the resulting frequency makes no sense for some reason.
I disagree with this, as explained above. Calibration can be done successfully in the special cases I mentioned. However, in cases where it cannot be done, which includes the great majority of the actual beliefs and conclusions made by human brains, devising numerical probabilities makes no sense.
(5) Unlike #3, beliefs can be ordered or classified on some scale (possibly imprecisely), but assigning numerical values brings confusing connotations and should be avoided. Alternatively said, the meaning of subjective probabilities is preserved under monotonic rescaling.
This should be clear from the answer to (3).
[Continued in a separate comment below due to excessive length.]
I should first state the general position I’m coming from, which motivates me to get into discussions of this sort. Namely, it is my firm belief that when we look at the present state of human knowledge, one of the principal sources of confusion, nonsense, and pseudoscience is physics envy, which leads people in all sorts of fields to construct nonsensical edifices of numerology and then pretend, consciously or not, that they’ve reached some sort of exact scientific insight.
In my view, if someone’s numbers are wrong, that should be dealt with on the object level (e.g. “0.001 is too low”, with arguments for why), rather than retreating to the meta level of “using numbers caused you to err”. The perspective I come from is wanting to avoid the opposite problem, where being vague about one’s beliefs allows one to get away without subjecting them to rigorous scrutiny. (This, too, by the way, is a major hallmark of pseudoscience.)
But I’ll note that even as we continue to argue under opposing rhetorical banners, our disagreement on the practical issue seems to have mostly evaporated; see here for instance. You also do admit in the end that fear of poor calibration is what underlies your discomfort with numerical probabilities:
If I wish to express these probabilities as numbers, however, this is not a legitimate step unless the resulting numbers can be justified… If they can be so justified, then we say that the intuitive estimate is “well-calibrated.” However, calibration is usually not possible in practice...
As a theoretical matter, I disagree completely with the notion that probabilities are not legitimate or meaningful unless they’re well-calibrated. There is such a thing as a poorly-calibrated Bayesian; it’s a perfectly coherent concept. The Bayesian view of probabilities is that they refer specifically to degrees of belief, and not anything else. We would of course like the beliefs so represented to be as accurate as possible; but they may not be in practice.
If my internal “Bayesian calculator” believes P(X) = 0.001, and X turns out to be true, I’m not made less wrong by having concealed the number, saying “I don’t think X is true” instead. Less embarrassed, perhaps, but not less wrong.
In my view, if someone’s numbers are wrong, that should be dealt with on the object level (e.g. “0.001 is too low”, with arguments for why), rather than retreating to the meta level of “using numbers caused you to err”.
Trouble is, sometimes numbers can be not even wrong, with their very definition lacking logical consistency or any defensible link with reality. It is that category that I am most concerned with, and I believe that it sadly occurs very often in practice, with entire fields of inquiry sometimes degenerating into meaningless games with such numbers. My honest impression is that in our day and age, such numerological fallacies have been responsible for much greater intellectual sins than the opposite fallacy of avoiding scrutiny by excessive vagueness, although the latter phenomenon is not negligible either.
You also do admit in the end that fear of poor calibration is what underlies your discomfort with numerical probabilities:
Here we seem to be clashing about terminology. I think that “poor calibration” is too much of a euphemism for the situations I have in mind, namely those where sensible calibration is altogether impossible. I would instead use some stronger expression clarifying that the supposed “calibration” is done without any valid basis, not that the result is poor because some unfortunate circumstance occurred in the course of an otherwise sensible procedure.
There is such a thing as a poorly-calibrated Bayesian; it’s a perfectly coherent concept. The Bayesian view of probabilities is that they refer specifically to degrees of belief, and not anything else.
As I explained in the above lengthy comment, I simply don’t find numbers that “refer specifically to degrees of belief, and not anything else” a coherent concept. We seem to be working with fundamentally different philosophical premises here.
Can these numerical “degrees of belief” somehow be linked to observable reality according to the criteria I defined in my reply to the points (1)-(2) above? If not, I don’t see how admitting such concepts can be of any use.
If my internal “Bayesian calculator” believes P(X) = 0.001, and X turns out to be true, I’m not made less wrong by having concealed the number, saying “I don’t think X is true” instead. Less embarrassed, perhaps, but not less wrong.
But if you do this 10,000 times, and the number of times X turns out to be true is small but nowhere close to 10, you are much more wrong than if you had just been saying “X is highly unlikely” all along.
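To make “much more wrong” concrete, here is a minimal sketch using a standard logarithmic score. The assumed true frequency of 2% and the reading of “highly unlikely” as roughly 5% are both numbers I am making up for the illustration:

    import math

    n = 10_000
    hits = 200   # X turned out true 200 times: "small but nowhere close to 10"

    def total_log_loss(p):
        # Total negative log-likelihood of the outcomes under a constant forecast p.
        return -(hits * math.log(p) + (n - hits) * math.log(1 - p))

    print(round(total_log_loss(0.001)))   # ~1391: always insisting P(X) = 0.001
    print(round(total_log_loss(0.05)))    # ~1102: "highly unlikely", read as roughly 5%
    # The overconfident 0.001 forecaster scores substantially worse.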
On the other hand, if we’re observing X as a single event in isolation, I don’t see how this tests your probability estimate in any way. But I suspect we have some additional philosophical differences here.
(6) Although, strictly speaking, human reasoning can be modelled as a Bayesian network where beliefs have numerical strengths, human introspection is poor at assessing their values. Declared values more likely depend on anchoring than on the real strength of the belief. Speaking about numbers actually introduces noise into reasoning.
I have revised my view about this somewhat thanks to a shrewd comment by xv15. The use of unjustified numerical probabilities can sometimes be a useful figure of speech that will convey an intuitive feeling of certainty to other people more faithfully than verbal expressions. But the important thing to note here is that the numbers in such situations are mere figures of speech, i.e. expressions that exploit various idiosyncrasies of human language and thinking to transmit hard-to-convey intuitive points via non-literal meanings. It is not legitimate to use these numbers for any other purpose.
Otherwise, I agree. Except in the above-discussed cases, subjective probabilities extracted from common-sense reasoning are at best an unnecessary addition to arguments that would be just as valid and rigorous without them. At worst, they can lead to muddled and incorrect thinking based on a false impression of accuracy, rigor, and insight where there is none, and ultimately to numerological pseudoscience.
Also, we still don’t know whether and to what extent various parts of our brains involved in common-sense reasoning approximate Bayesian networks. It may well be that some, or even all of them do, but the problem is that we cannot look at them and calculate the exact probabilities involved, and these are not available to introspection. The fallacy of radical Bayesianism that is often seen on LW is in the assumption that one can somehow work around this problem so as to meaningfully attach an explicit Bayesian procedure and a numerical probability to each judgment one makes.
Note also that even if my case turns out to be significantly weaker under scrutiny, it may still be a valid counterargument to the frequently voiced position that one can, and should, attach a numerical probability to every judgment one makes.
So, that would be a statement of my position; I’m looking forward to any comments.
Suppose you have two studies, each of which measures and gives a probability for the same thing. The first study has a small sample size, and a not terribly rigorous experimental procedure; the second study has a large sample size, and a more thorough procedure. When called on to make a decision, you would use the probability from the larger study. But if the large study hadn’t been conducted, you wouldn’t give up and act like you didn’t have any probability at all; you’d use the one from the small study. You might have to do some extra sanity checks, and your results wouldn’t be as reliable, but they’d still be better than if you didn’t have a probability at all.
A probability assigned by common-sense reasoning is to a probability that came from a small study, as a probability from a small study is to a probability from a large study. The quality of probabilities varies continuously; you get better probabilities by conducting better studies. By saying that a probability based only on common-sense reasoning is meaningless, I think what you’re really trying to do is set a minimum quality level. Since probabilities that’re based on studies and calculation are generally better than probabilities that aren’t, this is a useful heuristic. However, it is only that, a heuristic; probabilities based on common-sense reasoning can sometimes be quite good, and they are often the only information available anywhere (and they are, therefore, the best information). Not all common-sense-based probabilities are equal; if an expert thinks for an hour and then gives a probability, without doing any calculation, then that probability will be much better than if a layman thinks about it for thirty seconds. The best common-sense probabilities are better than the worst statistical-study probabilities; and besides, there usually aren’t any relevant statistical calculations or studies to compare against.
I think what’s confusing you is an intuition that if someone gives a probability, you should be able to take it as-is and start calculating with it. But suppose you had collected five large studies, and someone gave you the results of a sixth. You wouldn’t take that probability as-is, you’d have to combine it with the other five studies somehow. You would only use the new probability as-is if it was significantly better (larger sample, more trustworthy procedure, etc) than the ones you already had, or you didn’t have any before. Now if there are no good studies, and someone gives you a probability that came from their common-sense reasoning, you almost certainly have a comparably good probability already: your own common-sense reasoning. So you have to combine it. So in a sense, those sorts of probabilities are less meaningful—you discard them when they compete with better probabilities, or at least weight them less—but there’s still a nonzero amount of meaning there.
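As a minimal sketch of what “combine it” might look like in practice, here is one simple way to pool two estimates: a weighted average in log-odds space. The weights are made-up stand-ins for how much each source is trusted, not a canonical formula:

    import math

    def logit(p):
        return math.log(p / (1 - p))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    def pool(p_mine, p_theirs, w_mine=1.0, w_theirs=1.0):
        # Weighted average of the two opinions in log-odds space.
        x = (w_mine * logit(p_mine) + w_theirs * logit(p_theirs)) / (w_mine + w_theirs)
        return inv_logit(x)

    print(pool(0.30, 0.70))                            # 0.5: two equally trusted guesses
    print(pool(0.30, 0.70, w_mine=1.0, w_theirs=4.0))  # ~0.62: theirs comes from a better study

The common-sense number still carries weight in the final answer; it is simply weighted down when better probabilities are available.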
(Aside: I’ve been stuck for awhile on an article I’m writing called “What Probability Requires”, dealing with this same topic, and seeing you argue the other side has been extremely helpful. I think I’m unstuck now; thank you for that.)
After thinking about your comment, I think this observation comes close to the core of our disagreement:
By saying that a probability based only on common-sense reasoning is meaningless, I think what you’re really trying to do is set a minimum quality level.
Basically, yes. More specifically, the quality level I wish to set is that the numbers must give more useful information than mere verbal expressions of confidence. Otherwise, their use at best simply adds nothing useful, and at worst leads to fallacious reasoning encouraged by a false feeling of accuracy.
Now, there are several possible ways to object to my position:
The first is to note that even if not meaningful mathematically, numbers can serve as communication-facilitating figures of speech. I have conceded this point.
The second way is to insist on an absolute principle that one should always attach numerical probabilities to one’s beliefs. I haven’t seen anything in this thread (or elsewhere) yet that would shake my belief in the fallaciousness of this position, or even provide any plausible-seeming argument in favor of it.
The third way is to agree that sometimes attaching numerical probabilities to common-sense judgments makes no sense, but that in some cases common-sense reasoning can produce numerical probabilities that give more useful information than just fuzzy words. After the discussion with mattnewport and others, I agree that there are such cases, but I still maintain that these are rare exceptions. (In my original statement, I took an overly restrictive notion of “common sense”; I admit that in some cases, thinking that could reasonably be described that way is indeed precise enough to produce meaningful numerical probabilities.)
So, to clarify, which exact position do you take in this regard? Or would your position require a fourth item to summarize fairly?
I think what’s confusing you is an intuition that if someone gives a probability, you should be able to take it as-is and start calculating with it. [...] So in a sense, those sorts of probabilities are less meaningful—you discard them when they compete with better probabilities, or at least weight them less—but there’s still a nonzero amount of meaning there.
I agree that there is a non-zero amount of meaning, but the question is whether it exceeds what a simple verbal statement of confidence would convey. If I can’t take a number and start calculating with it, what good is it? (Except for the caveat about possible metaphorical meanings of numbers.)
My response to this ended up being a whole article, which is why it took so long. The short version of my position is, we should attach numbers to beliefs as often as possible, but for instrumental reasons rather than on principle.
As a matter of fact I can think of one reason—a strong reason in my view—that the consciously felt feeling of certainty is liable to be systematically and significantly exaggerated relative to the true probability assignment made by the person’s mental black box—the latter being something that we might in principle elicit through experimentation by putting the same subject through variants of a given scenario. (Think revealed probability assignment—similar to revealed preference as understood by the economists.)
The reason is that whole-hearted commitment is usually best whatever one chooses to do. Consider Buridan’s ass, but with the following alterations. Instead of hay and water, to make it more symmetrical suppose the ass has two buckets of water, one on either side about equally distant. Suppose furthermore that his mental black box assigns a 51% probability to the proposition that the bucket on the right side is closer to him than the bucket on the left side.
The question, then, is what should the ass consciously feel about the probability that the bucket on the right is closest? I propose that given that his black box assigns a 51% probability to this, he should go to the bucket on the right. But given that he should go to the bucket on the right, he should go there without delay, without a hesitating step, because hesitation is merely a waste of time. But how can the ass go there without delay if he is consciously feeling that the probability is 51% that the bucket on the right is closest? That feeling will cause within him uncertainty and hesitation and will slow him down. Therefore it is best if the ass consciously is absolutely convinced that the bucket on the right is closest. This conscious feeling of certainty will speed his step and get him to the water quickly.
So it is best for Buridan’s ass that his consciously felt degrees of certainty are great exaggerations of his mental black box’s probability assignments. I think this generalizes. We should consciously feel much more certain of things than we really are, in order to get ourselves moving.
In fact, if Buridan’s ass’s mental black box assigns exactly 50% probability to the right bucket being the closer one, the mental black box should in effect flip a coin and then delude the conscious self to become entirely convinced that the right (or, depending on the coin flip, the left) bucket is the closest and act accordingly.
This can be applied to the reactions of prey to predators. It is so costly for a prey animal to be eaten, and comparatively cheap for it merely to waste a bit of time running, that a prey animal is most likely to survive to reproduce if it is in the habit of completely believing that there is a predator after it far more often than there really is one. Even if possible-predator-signals in the environment actually signify predators 10% of the time or less, since the prey animal never knows which of those signals is the predator, the prey needs to run for its very life every single time it senses the possible-predator-signal. For it to do this, it must be fully mentally committed to the proposition that there is in fact a predator after it. There is no reason for the prey animal to have any less than full belief that there is a predator after it, each and every time it senses a possible predator.
I don’t agree with this conflation of commitment and belief. I’ve never had to run from a predator, but when I run to catch a train, I am fully committed to catching the train, although I may be uncertain about whether I will succeed. In fact, the less time I have, the faster I must run, but the less likely I am to catch the train. That only affects my decision to run or not. On making the decision, belief and uncertainty are irrelevant, intention and action are everything.
Maybe some people have to make themselves believe in an outcome they know to be uncertain, in order to achieve it, but that is just a psychological exercise, not a necessary part of action.
The question is not whether there are some examples of commitment which do not involve belief. The question is whether there are (some, many) examples where really, absolutely full commitment does involve belief. I think there are many.
Consider what commitment is. If someone says, “you don’t seem fully committed to this”, what sort of thing might have prompted him to say this? It’s something like, he thinks you aren’t doing everything you could possibly do to help this along. He thinks you are holding back.
You might reply to this criticism, “I am not holding anything back. There is literally nothing more that I can do to further the probability of success, so there is no point in doing more—it would be an empty and possibly counterproductive gesture rather than being an action that truly furthers the chance of success.”
So the important question is, what can a creature do to further the probability of success? Let’s look at you running to catch the train. You claim that believing that you will succeed would not further the success of your effort. Well, of course not! I could have told you that! If you believe that you will succeed, you can become complacent, which runs the risk of slowing you down.
But if you believe that there is something chasing you, that is likely to speed you up.
Your argument is essentially, “my full commitment didn’t involve belief X, therefore you’re wrong”. But belief X is a belief that would have slowed you down. It would have reduced, not furthered, your chance of success. So of course your full commitment didn’t involve belief X.
My point is that it is often the case that a certain consciously felt belief would increase a person’s chances of success, given their chosen course of action. And in light of what commitment is—it is commitment of one’s self and one’s resources to furthering the probability of success—then if a belief would further a chance of success, then full, really full commitment will include that belief.
So I am not conflating conscious belief with commitment. I am saying that conscious belief can be, and often is, involved in the furthering of success, and therefore can be and often is a part of really full commitment. That is no more conflating belief with commitment than saying that a strong fabric makes a good coat conflates fabric with coats.
You’re right that my analogy was inaccurate: what corresponds in the train-catching scenario to believing there is a predator is my belief that I need to catch this train.
My point is that it is often the case that a certain consciously felt belief would increase a person’s chances of success, given their chosen course of action. And in light of what commitment is—it is commitment of one’s self and one’s resources to furthering the probability of success—then if a belief would further a chance of success, then full, really full commitment will include that belief.
A stronger belief may produce stronger commitment, but strong commitment does not require strong belief. The animal either flees or does not, because a half-hearted sprint will have no effect on the outcome whether a predator is there or not. Similarly, there’s no point making a half-hearted jog for a train, regardless of how much or little one values catching it.
Belief and commitment to act on the belief are two different parts of the process.
Of course, a lot of the “success” literature urges people to have faith in themselves, to believe in their mission, to cast all doubt aside, etc., and if a tool works for someone I’ve no urge to tell them it shouldn’t. But, personally, I take Yoda’s attitude: “Do, or do not.”
Yoda tutors Luke in a Jedi philosophy and practice, which it will take Luke a while to learn. In the meantime, however, Luke is merely an unpolished human. And I am not here recommending a particular philosophy and practice of thought and behavior, but making a prediction about how unpolished humans (and animals) are likely to act. My point is not to recommend that Buridan’s ass should have an exaggerated confidence that the right bucket is closer, but to observe that we can expect him to have an exaggerated confidence, because, for reasons I described, exaggerated confidence is likely to have been selected for: it is likely to have improved the chances of survival of asses who did not have the benefit of Yoda’s instruction.
So I don’t recommend, rather I expect that humans will commonly have conscious feelings of confidence which are exaggerated, and which do not truly reflect the output of the human’s mental black box, his mental machinery to which he does not have access.
Let me explain by the way what I mean here, because I’m saying that the black box can output a 51% probability for Proposition P while at the same time causing the person to be consciously absolutely convinced of the truth of P. This may be confusing, because I seem to be saying that the black box outputs two probabilities, a 51% probability for purposes of decisionmaking and a 100% probability for conscious consumption. So let me explain with an example what I mean.
Suppose you want to test Buridan’s ass to see what probability he assigns to the proposition that the right bucket is closer. What you can do is take the scenario and alter as follows: introduce a mechanism which, with 4% probability, will move the right bucket further than the left bucket before Buridan’s ass gets to it.
Now, if Buridan’s ass assigns a 100% probability that the right bucket is (currently) closer than the left bucket, then taking into account the introduced mechanism, this yields a 96% probability that, by the time the ass gets to it, the right bucket will still be closer to the ass’s starting position. But if Buridan’s ass assigns a 51% probability that the right bucket is (currently) closer than the left bucket, then taking into account the mechanism, this yields approximately a 49% probability (assuming I did the numbers right) that by the time the ass gets to it, the right bucket will be closer.
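Here is the arithmetic spelled out, just to check the numbers. p_right is the black box’s probability that the right bucket is currently closer, and 4% is the chance that the mechanism moves it before the ass arrives:

    def p_right_still_closer(p_right, p_move=0.04):
        # The right bucket ends up closer only if it is currently closer
        # and the mechanism does not fire before the ass arrives.
        return p_right * (1 - p_move)

    print(p_right_still_closer(0.51))   # ~0.49: now below 50%, so the ass heads left
    print(p_right_still_closer(1.00))   # 0.96: the ass still heads right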
I am, of course, assuming that the ass is smart enough to understand and incorporate the mechanism into his calculations. Animals have eyes and ears and brains for a reason, so I don’t think it’s a stretch to suppose that there is some way to implement this scenario in a way that an ass really could understand.
So here’s how the test works. You observe that the ass goes to the bucket on the right. You are not sure whether the ass has assigned a 51% probability or a 100% probability to the right bucket being nearer. So you redo the experiment with the added mechanism. If the ass now (with the introduced mechanism) goes to the bucket on the left, then you can infer that the ass now believes that the probability that the right bucket will be closer by the time he reaches it is less than 50%. But it only changed by a few percentage points as a result of the added mechanism. Therefore he must have assigned only slightly more than 50% probability to it to begin with.
And in this sort of way, you can elicit the ass’s probability assignments.
The ass’s conscious state of mind, however, is something completely separate from this. If we grant the ass the gift of speech, the ass may well say, each time, “there’s not a shred of doubt in my mind that the right bucket is closer”, or “I am entirely confident that the left bucket is closer”.
My point being that we may well be like the ass, and introspective examination of our own conscious state of mind may fail to reveal the actual probabilities that our mental black boxes have assigned to events. It may instead reveal only overconfident delusions that the black box has instilled in the conscious mind for the purpose of encouraging quick action.
Thanks for the lengthy answer. Still, why is it impossible to calibrate people in general, looking at how often they get the answer right, and then to use them as a device for measuring probabilities? If a person is right on approximately 80% of the issues he says he’s “sure” about, then why not translate his next “sure” into an 80% probability? That doesn’t seem arbitrary to me. There may be inconsistency between measurements using different people, but strictly speaking, thermometers and clocks also sometimes disagree.
I do discuss this exact point in the above lengthy comment, and I allow for this possibility. Here is the relevant part:
The first possible path towards accurate calibration is when the same person performs essentially the same judgment many times, and from the past performance we extract the frequency with which their brain tends to produce the right answer. If this level of accuracy remains roughly constant in time, then it makes sense to attach it as the probability to that person’s future judgments on the topic. This approach treats the relevant operations in the brain as a black box whose behavior, being roughly constant, can be subjected to such extrapolation.
Now clearly, the critical part is to ensure that the future judgments are based on the same parts of the person’s brain and that the relevant features of these parts, as well as the problem being solved, remain unchanged. In practice, these requirements can be satisfied by people who have reached the peak of ability achievable by learning from experience in solving some problem that repeatedly occurs in nearly identical form. Still, even in the best case, we’re talking about a very limited number of questions and people here.
I know you have limited it to repeated judgments about essentially the same question. I was rather asking why, and I am still not sure whether I interpret it correctly. Is it that the judgments themselves are possibly produced by different parts of the brain, or that the person’s self-evaluations of certainty are produced by different parts of the brain, or both? And if so, so what?
Imagine a test is done on a particular person. During five consecutive years he is asked a lot of questions (of all different types), and he has to give an answer and a subjective feeling of certainty. After that, we see that the answers which he labeled as “almost certain” were right in 83%, 78%, 81%, 84% and 85% of cases in the five years. Let’s even say that the experimenters were careful enough to divide the questions into different topics, and to establish that his “almost certain” answers about medicine were right 94% of the time on average and his “almost certain” answers about politics were right 56% of the time on average. All other topics were near the overall average.
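Here is a minimal sketch of the bookkeeping such an experiment would require: record each “almost certain” answer together with its topic and whether it turned out correct, then read off the per-topic hit rates. The sample records are of course invented:

    from collections import defaultdict

    records = [
        ("medicine", True), ("medicine", True), ("medicine", False),
        ("politics", True), ("politics", False), ("politics", False),
    ]

    by_topic = defaultdict(lambda: [0, 0])   # topic -> [correct answers, total answers]
    for topic, correct in records:
        by_topic[topic][1] += 1
        by_topic[topic][0] += int(correct)

    for topic, (correct, total) in by_topic.items():
        print(topic, correct / total)   # the empirical frequency behind "almost certain"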
Do you 1) maintain that such stable results are very unlikely to happen, or 2) hold that even if most people can be calibrated in such a way, it still doesn’t justify using them for measuring probabilities?
I know you have limited it to repeated judgments about essentially the same question. I was rather asking why, and I am still not sure whether I interpret it correctly. Is it that the judgments themselves are possibly produced by different parts of the brain, or that the person’s self-evaluations of certainty are produced by different parts of the brain, or both? And if so, so what?
We don’t really know, but it could certainly be both, and also it may well be that the same parts of the brain are not equally reliable for all questions they are capable of processing. Therefore, while simple inductive reasoning tells us that consistent accuracy on the same problem can be extrapolated, there is no ground to generalize to other questions, since they may involve different parts of the brain, or the same part functioning in different modes that don’t have the same accuracy.
Unless, of course, we cover all such various parts and modes and obtain some sort of weighted average over them, which I suppose is the point of your thought experiment, of which more below.
Do you 1) maintain that such stable results are very unlikely to happen, or 2) hold that even if most people can be calibrated in such a way, it still doesn’t justify using them for measuring probabilities?
If the set of questions remains representative—in the sense of querying the same brain processes with the same frequency—the results could turn out to be fairly stable. This could conceivably be achieved by large and wide-ranging sets of questions. (I wonder if someone has actually done such experiments?)
However, the result could be replicated only if the same person is again asked similar large sets of questions that are representative with regards to the frequencies with which they query different brain processes. Relative to that reference class, it clearly makes sense to attach probabilities to answers, so, yes, here we would have another counterexample for my original claim, for another peculiar meaning of probabilities.
The trouble is that these probabilities would be useless for any purpose that doesn’t involve another similar representative set of questions. In particular, sets of questions about some particular topic that is not representative would presumably not replicate them, and thus they would be a very bad guide for betting that is limited to some particular topic (as it nearly always is). Thus, this seems like an interesting theoretical exercise, but not a way to obtain practically useful numbers.
(I should add that I never thought about this scenario before, so my reasoning here might be wrong.)
If there are any experimental psychologists reading this, maybe they can organise the experiment. I am curious whether people can indeed be calibrated on general questions.
I tell you I believe X with 54% certainty. Who knows, that number could have been generated in a completely bogus way. But however I got here, this is where I am. There are bets about X that I will and won’t take, and guess what, that’s my cutoff probability right there. And by the way, now I have communicated to you where I am, in a way that does not further compound the error.
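To spell out operationally what I mean by the cutoff, here is a minimal sketch; the particular bet sizes are arbitrary examples:

    def take_bet(p, win, lose):
        # A bet that pays `win` if X and costs `lose` if not-X has positive
        # expected value exactly when p > lose / (win + lose).
        return p * win - (1 - p) * lose > 0

    p = 0.54
    print(take_bet(p, win=1.0, lose=1.0))   # True: even odds, cutoff 0.50
    print(take_bet(p, win=0.9, lose=1.0))   # True: cutoff ~0.526
    print(take_bet(p, win=0.8, lose=1.0))   # False: cutoff ~0.556
    # The pattern of accepted and declined bets brackets my probability between
    # roughly 0.526 and 0.556; finer-grained odds would pin it down further.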
Meaningless is a very strong word.
In the face of such uncertainty, it could feel natural to take shelter in the idea of “inherent vagueness”...but this is reality, and we place our bets with real dollars and cents, and all the uncertainty in the world collapses to a number in the face of the expectation operator.
So why stop there? If you can justify 54%, then why not go further and calculate a dozen or two more significant digits, and stand behind them all with unshaken resolve?
You can, of course. For most situations, the effort is not worth the trade-off. But making a distinction between 1%, 25%, 50%, 75%, and 99% often is.
You can (at least formally) put error bars on the quantities that go into a Bayesian calculation. The problem, of course, is that error bars are short-hand for a distribution of possible values, and it’s not obvious what a distribution of probabilities means or should mean. Everything operational about probability functions is fully captured by their full set of expectation values, so this is no different than just immediately taking the mean, right?
Well, no. The uncertainties are a higher level model that not only makes predictions, but also calibrates how much these predictions are likely to move given new data.
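A small sketch of that point, using Beta distributions as the higher-level model; the parameters and the batch of new data are chosen arbitrarily for illustration:

    def beta_mean(a, b):
        return a / (a + b)

    # Two belief states with the same expected probability (0.5) but different
    # "error bars": one backed by lots of prior data, one by almost none.
    confident = (500, 500)
    vague = (1, 1)

    successes, failures = 8, 2   # a new batch of observations
    for name, (a, b) in (("confident", confident), ("vague", vague)):
        print(name, beta_mean(a, b), round(beta_mean(a + successes, b + failures), 3))
    # confident: 0.5 -> 0.503 (barely moves)
    # vague:     0.5 -> 0.75  (moves a lot)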
It seems to me that this is somewhat related to the problem of logical uncertainty.
Again, meaningless is a very strong word, and it does not make your case easy. You seem to be suggesting that NO number, however imprecise, has any place here, and so you do not get to refute me by saying that I have to embrace arbitrary precision.
In any case, if you offer me some bets with more significant digits in the odds, my choices will reveal the cutoff to more significant digits. Wherever it may be, there will still be some bets I will and won’t take, and the number reflects that, which means it carries very real meaning.
Now, maybe I will hold the line at 54% exactly, not feeling any gain to thinking harder about the cutoff (as it gets harder AND less important to nail down further digits). Heck, maybe on some other issue I only care to go out to the nearest 10%. But so what? There are plenty of cases where I know my common sense belief probability to within 10%. That suggests such an estimate is not meaningless.
Again, meaningless is a very strong word, and it does not make your case easy.
To be precise, I wrote “meaningless, except perhaps as a vague figure of speech.” I agree that the claim would be too strong without that qualification, but I do believe that “vague figure of speech” is a fair summary of the meaningfulness that is to be found there. (Note also that the claim specifically applies to “common-sense conclusions and beliefs,” not things where there is a valid basis for employing mathematical models that yield numerical probabilities.)
In any case, if you offer me some bets with more significant digits in the odds, my choices will reveal the cutoff to more significant digits. Wherever it may be, there will still be some bets I will and won’t take, and the number reflects that, which means it carries very real meaning.
You seem to be saying that since you perceive this number as meaningful, you will be willing to act on it, and this by itself renders it meaningful, since it serves as guide for your actions. If we define “meaningful” to cover this case, then I agree with you, and this qualification should be added to my above statement. But the sense in which I used the term originally doesn’t cover this case.
Fair. Let me be precise too. I read your original statement as saying that numbers will never add meaning beyond what a vague figure of speech would, i.e. if you say “I strongly believe this” you cannot make your position more clear by attaching a number. That I disagree with. To me it seems clear that:
i) “Common-sense conclusions and beliefs” are held with varying levels of precision.
ii) Often even these beliefs are held with a level of precision that can be best described with a number. (Best=most succinctly, least misinterpretable, etc...indeed it seems to me that sometimes “best” could be replaced with “only.” You will never get people to understand 60% by saying “I reasonably strongly believe”...and yet your belief may be demonstrably closer to 60 than 50 or 70).
I don’t think your statement is defensible from a normal definition of “common sense conclusions,” but you may have internally defined it in such a way as to make your statement true, with a (I think) relatively narrow sense of “meaningfulness” also in mind. For instance if you ignore the role of numbers in transmission of belief from one party to the next, you are a big step closer to being correct.
I don’t think your statement is defensible from a normal definition of “common sense conclusions,” but you may have internally defined it in such a way as to make your statement true, with a (I think) relatively narrow sense of “meaningfulness” also in mind. For instance if you ignore the role of numbers in transmission of belief from one party to the next, you are a big step closer to being correct.
You have a very good point here. For example, a dialog like this could result in a real exchange of useful information:
A: “I think this project will probably fail.” B: “So, you mean you’re, like, 90% sure it will fail?” A: “Um… not really, more like 80%.”
I can imagine a genuine meeting of minds here, where B now has a very good idea of how confident A feels about his prediction. The numbers are still used as mere figures of speech, but “vague” is not a correct way to describe them, since the information has been transmitted in a more precise way than if A had just used verbal qualifiers.
So, I agree that “vague” should probably be removed from my original claim.
Therefore, there are only two ways in which you can arrive at a numerical probability estimate for a common-sense belief:
Translate your vague feeling of certainty into a number in some arbitrary manner. This however makes the number a mere figure of speech, which adds absolutely nothing over the usual human vague expressions for different levels of certainty.
Perform some probability calculation, which however has nothing to do with how your brain actually arrived at your common-sense conclusion, and then assign the probability number produced by the former to the latter. This is clearly fallacious.
On point #2, I agree with you. On point #1, I had the same reaction as xv15. Your example conversation is exactly how I would defend the use of numerical probabilities in conversation. I think you may have confused people with the phrase “vague figure of speech,” which was itself vague.
Vague relative to what? “No idea / kinda sure / pretty sure / very sure?”, the ways that people generally communicate about probability, are much worse. You can throw in other terms like “I suspect” and “absolutely certain” and “very very sure”, but it’s not even clear how these expressions of belief match up with others. In common speech, we really only have about 3-5 degrees of probability. That’s just not enough gradations.
In contrast, when expressing a percentage probability, people only tend to use multiples of 10, certain multiples of 5, 0.01%, 1%, 2%, 98%, 99% and 99.99%. If people use figures like 87%, or any decimal places other than the ones previously mentioned, it’s usually because they are deliberately being ridiculous. (And it’s no coincidence that your example uses multiples of 10.)
I agree with you that feelings of uncertainty are fuzzy, but they aren’t so fuzzy that we can get by with merely 3-5 gradations in all sorts of conversations. On some subjects, our communication becomes more precise when we have 10-20 gradations. Yet there are diminishing returns on more degrees of communicable certainty (due to reasons you correctly describe), so going any higher resolution than 10-20 degrees isn’t useful for anything except jokes.
The numbers are still used as mere figures of speech, but “vague” is not a correct way to describe them, since the information has been transmitted in a more precise way than if A had just used verbal qualifiers.
Yes. Gaining the 10-20 gradations that numbers allow when they are typically used does make conversations relatively more precise than just by tacking on “very very” to your statement of certainty.
It’s similar to the infamous 1-10 rating system for people’s attractiveness. Despite various reasons that rating people with numbers is distasteful, this ranking system persists because, in my view, people find it useful for communicating subjective assessments of attractiveness. Ugly-cute-hot is a 3-point scale. You could add in “gorgeous,” “beautiful,” or modifiers like “smoking hot,” but it’s unclear how these terms rank against each other (and they may express different types of attraction, rather than different degrees). Again, it’s hard to get more than 3-5 degrees using plain English. The 1-10 scale (with half-points, and 9.9) gives you about 20 gradations (though 1-3, and any half-point values below 5 are rarely used).
I think we have a generalized phenomenon where people resort to using numbers to describe their subjective feelings when common language doesn’t grant high enough resolution. 3-5 is good enough for some feelings (3 gives you negative, neutral, and positive for instance), but for some feelings we need more. Somewhere around 20 is the upper-bound of useful gradations.
I mostly agree with this assessment. However, the key point is that such uses of numbers should be seen as metaphorical. The literal meaning of a metaphor is typically nonsensical, but it works by somehow hacking the human understanding of language to successfully convey a point with greater precision than the most precise literal statement would allow, at least in as many words. (There are other functions of metaphors too, of course, but this one is relevant here.) And just like it is fallacious to understand a metaphor literally, it is similarly fallacious to interpret these numerical metaphors as useful for mathematical purposes. When it comes to subjective probabilities, however, I often see what looks like confusion on this point.
It is wrong to use a subjective probability that you got from someone else for mathematical purposes directly, for reasons I expand on in my comment here. But I don’t think that makes them metaphorical, unless you’re using a definition of metaphor that’s very different than the one I am. And you can use a subjective probability which you generated yourself, or combined with your own subjective probability, in calculations. Doing so just comes with the same caveats as using a probability from a study whose sample was too small, or which had some other bad but not entirely fatal flaw.
I will write a reply to that earlier comment of yours a bit later today when I’ll have more time. (I didn’t forget about it, it’s just that I usually answer lengthy comments that deserve a greater time investment later than those where I can fire off replies rapidly during short breaks.)
But in addition to the theme of that comment, I think you’re missing my point about the possible metaphorical quality of numbers. Human verbal expressions have their literal information content, but one can often exploit the idiosyncrasies of the human language interpretation circuits to effectively convey information altogether different from the literal meaning of one’s words. This gives rise to various metaphors and other figures of speech, which humans use in their communication frequently and effectively. (The process is more complex than this simple picture, since frequently used metaphors can eventually come to be understood as literal expressions of their common metaphorical meaning, and this process is gradual. There are also other important considerations about metaphors, but this simple observation is enough to support my point.)
Now, I propose that certain practical uses of numbers in communication should be seen that way too. A literal meaning of a number is that something can ultimately be counted, measured, or calculated to arrive at that number. A metaphorical use of a number, however, doesn’t convey any such meaning, but merely expects to elicit similar intuitive impressions, which would be difficult or even impossible to communicate precisely using ordinary words. And just like a verbal metaphor is nonsensical except for the non-literal intuitive point it conveys, and its literal meaning should be discarded, at least some practical uses of numbers in human conversations serve only to communicate intuitive points, and the actual values are otherwise nonsensical and should not be used for any other purposes—and even if they perhaps are, their metaphorical value should be clearly seen apart from their literal mathematical value.
Therefore, regardless of our disagreement about subjective probabilities (of which more in my planned reply), this is a separate important point I wanted to make.
Okay. I still suspect I disagree with whatever you mean by mere “figures of speech,” but this rational truthseeker does not have infinite time or energy.
In any case, thank you for a productive and civil exchange.
Even if you believe that my position is fallacious, I am sure not the one to be accused of arbitrariness here. Arbitrariness is exactly what I object to, in the sense of insisting on the validity of numbers that lack both logically correct justification and clear error bars that would follow from it. And I’m asking the above question in full seriousness: a Bayesian probability calculation will give you as many significant digits as you want, so if you believe that it makes sense to extract a Bayesian probability with two significant digits from your common sense reasoning, why not more than that?
In any case, I have explained my position at length, and it would be nice if you addressed the substance of what I wrote instead of trying to come up with witty one-liner jabs. For those who want the latter, there are other places on the web full of people whose talent for such things is considerably greater than yours.
For those who want the latter, there are other places on the web full of people whose talent for such things is considerably greater than yours.
I specifically object to your implied argument in the grandparent. I will continue to reject comments that make that mistake regardless of how many times you insult me.
Look, in this thread, you have clearly been making jabs for rhetorical effect, without any attempt to argue in a clear and constructive manner. I am calling you out on that, and if you perceive that as insulting, then so be it.
Everything I wrote here has been perfectly honest and upfront, and written with the goal of eliciting rational counter-arguments from which I might perhaps change my opinion. I have neither the time nor the inclination for the sort of one-upmanship and showing off that you seem to be after, and even if I were, I would pursue it in some more suitable venue. (Where, among other things, one would indeed expect to see the sort of performance you’re striving for done in a much more skilled and entertaining way.)
Your map is not the territory. If you look a little closer you may find that my points are directed at the topic, and not your ego. In particular, take a second glance at this comment. The very example of betting illustrates the core problem with your position.
I am calling you out on that, and if you perceive that as insulting, then so be it.
The insult would be that you are telling me I’m bad at entertaining one-upmanship. I happen to believe I would be quite good at making such performances were I of a mind and in a context where it suited my goals (dealing with AMOGs, for example).
When dealing with intelligent agents, if you notice that what they are doing does not seem to be effective at achieving their goals it is time to notice your confusion. It is most likely that your model of their motives is inaccurate. Mind reading is hard.
Schultz does know nuthink. Slippery slopes do (arbitrarily) slide in both directions (from Schultz to Omega, in this case). Most importantly, if you cannot assign numbers to confidence levels, you will lose money when you try to bet.
Um, so when Nate Silver tells us he’s calculated odds of 2 in 3 that Republicans will control the House after the election, this number should be discarded as noise because it’s a common-sense belief that the Republicans will gain that many seats?
No, of course I didn’t mean anything like that. Here is how I see this situation. Silver has a model, which is ultimately a piece of mathematics telling us that some p=0.667, and for reasons of common sense, Silver believes (assuming he’s being upfront with all this) that this model closely approximates reality in such a way that p can be interpreted, with reasonable accuracy, as the probability of Republicans winning a House majority this November.
Now, when you ask someone which party is likely to win this election, this person’s brain will activate some algorithm that will produce an answer along with some rough level of confidence. Someone completely ignorant about politics might answer that he has no idea, and cannot say anything with any certainty. Other people will predict different results with varying (informally expressed) confidence. Silver himself, or someone else who agrees with his model, might reply that the best answer is whatever the model says (i.e. Republicans win with p=0.667), since it is completely superior to the opaque common-sense algorithms used by the brains of non-mathy political analysts. Others will have greater or lesser confidence in the accuracy of the model, and might take its results into account, with varying weight, alongside other common-sense considerations.
Ultimately, the status of this number depends on the relation between Silver’s model and reality. If you believe that the model is a vast improvement over any informal common-sense considerations in predicting election results, just like Newton’s theory is a vast improvement over any common-sense considerations in predicting the motions of planets, then we’re not talking about a common-sense conclusion any more. On the other hand, if you believe that the model is completely out of touch with reality, then you would discard its result as noise. Finally, if you believe that it’s somewhat accurate, but still not reliably superior to common sense, you might revise its conclusion using common sense.
What you believe about Silver’s model, however, is still ultimately a matter of common-sense judgment, and unless you think that you have a model so good that it should be used in a shut-up-and-calculate way, your ultimate best prediction of the election results won’t come with any numerical probabilities, merely a vague feeling of how confident you are.
What you believe about Silver’s model, however, is still ultimately a matter of common-sense judgment, and unless you think that you have a model so good that it should be used in a shut-up-and-calculate way, your ultimate best prediction of the election results won’t come with any numerical probabilities, merely a vague feeling of how confident you are.
For just about any interesting question you may ask, the algorithm that your brain uses to find the answer is not transparent to your consciousness—and its output doesn’t include a numerical probability estimate, merely a vague and coarsely graded feeling of certainty.
Do you not think that this feeling response can be trained through calibration exercises and by making and checking predictions? I have not done this myself yet, but this is how I’ve thought others became able to assign numerical probabilities with confidence.
Do you not think that this feeling response can be trained through calibration exercises and by making and checking predictions?
Well, sometimes frequentism can come to the rescue, in a sense. If you are repeatedly faced with an identical situation where it’s necessary to make some common-sense judgment, like e.g. on an assembly line, you can look at your past performance to predict how often you’ll be correct in the future. (This assuming you’re not getting better or worse with time, of course.) However, what you’re doing in that case is treating a part of your own brain as a black box whose behavior you’re testing empirically to extrapolate a frequentist rule—you are not performing the judgment itself as a rigorous Bayesian procedure that would give you the probability for the conclusion.
That said, it’s clear that smarter and more knowledgeable people think with greater accuracy and subtlety, so that their intuitive feelings of (un)certainty are also subtler and more accurate. But there is still no magic step that will translate these feelings output by black-box circuits in their brains into numbers that could lay claim to mathematical rigor and accuracy.
you are not performing the judgment itself as a rigorous Bayesian procedure that would give you the probability for the conclusion.
No, but do you think it is meaningless to think of the messy brain procedure (that produces these intuitive feelings) as approximating this rigorous Bayesian procedure? This could probably be quantified using various tests. I don’t dispute that one can’t lay claim to mathematical rigor, but I’m not sure that means that any human assignment of numerical probabilities is meaningless.
Yes, with good enough calibration, it does make sense. If you have an assembly line worker whose job is to notice and remove defective items, and he’s been doing it with a steady (say) 99.7% accuracy for a long time, it makes sense to assign p=0.997 to each single judgment he makes about an individual item, and this number can be of practical value in managing production. However, this doesn’t mean that you could improve the worker’s performance by teaching him about Bayesianism; his brain remains a black box. The important point is that the same typically holds for highbrow intellectual tasks too.
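A minimal sketch of the black-box calibration just described, under the assumption that past and future judgments are exchangeable (the counts are invented; the worker’s brain is treated purely as a track record):

```python
# Treating a judge (human or machine) as a black box: estimate per-judgment
# accuracy from the track record alone. The counts below are invented.

def track_record_probability(correct: int, total: int) -> float:
    """Frequency estimate of being right on the next, similar judgment.
    Laplace smoothing keeps a perfect streak from yielding exactly 1.0."""
    return (correct + 1) / (total + 2)

# e.g. 9,970 correct calls out of 10,000 inspected items
p = track_record_probability(9_970, 10_000)
print(f"p(correct on the next item) ~ {p:.4f}")  # ~0.997
```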
Moreover, for the great majority of interesting questions about the world, we don’t have the luxury of a large reference class of trials on which to calibrate. Take for example the recent discussion about the AD-36 virus controversy. If you look at the literature, you’ll presumably form an opinion about this question with a higher or lower certainty, depending on how much confidence you have in your own ability to judge about such matters. But how to calibrate this judgment in order to arrive at a probability estimate? There is no way.
To make sure I understand your point, I will try to restate it.
We have very limited access to our mental processes. In fact, in some cases our access to our mental processes is indirect—that is, we only discover what we believe once we have observed how we act. We observe our own act, and from this we can infer that we must have believed such-and-such. We can attempt to reconstruct our own process of thinking, but the process we are reconstructing is essentially a black box whose internals we can only model from the outside, and the outputs of the black box at any given time are meager. We are of course always using the black box, which gives us a lot of data to go on in an absolute sense, but since the topic is constantly changing and since our beliefs are also in flux, the relevance of most of that data to the correct understanding of a particular act of thinking is unclear. In modeling our own mental processes we are rationalizing, with all the potential pitfalls associated with rationalization.
Nevertheless, this does not stop us from using the familiar gambling method for eliciting probability assessments, understood as willingness to wager. The gambling method, even if it is artificial, is at least reasonable, because every behavior we exhibit involves a kind of wager. However the black box operates, it will produce a certain response for each offered betting odds, from which its probability assignments can be derived. Of course this won’t work if the black box produces inconsistent (i.e. Dutch bookable) responses to the betting odds, but whether and to what degree it does or not is an empirical question. As a matter of fact, you’ve been talking about precision, and I think here’s how we can define the precision of your probability assignment. I’m sure that the black box’s responses to betting odds will be somewhat inconsistent. We can measure how inconsistent they are. There will be a certain gap of a certain size which can be Dutch booked—the bigger the gap the quicker you can be milked. And this will be the measure of the precision of your probability assignment.
But suppose that a person always in effect bets for something given certain odds or above, in whatever manner the bet is put to him, and always bets against if given odds anywhere below, and suppose the cutoff between his betting for and against is some very precise number such as pi to twelve digits. Then that seems to say that the odds his black box assigns is precisely those odds.
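A toy version of this elicitation procedure (the noisy threshold inside the stand-in “black box” is invented for illustration): sweep over betting prices, find the highest price that is consistently accepted and the lowest that is consistently refused, and treat the gap between them as the imprecision of the implied probability assignment:

```python
import random

# Toy elicitation of a probability from betting behaviour. The "black box" is a
# stand-in for a person's accept/reject decisions; its noisy threshold of ~0.6
# is invented purely for the example.

def black_box_accepts(price: float) -> bool:
    """Would the box pay `price` for a ticket that pays 1 if the event occurs?"""
    return price < 0.6 + random.uniform(-0.05, 0.05)

def elicit(trials_per_price: int = 25):
    prices = [i / 100 for i in range(1, 100)]
    always_yes = [p for p in prices
                  if all(black_box_accepts(p) for _ in range(trials_per_price))]
    always_no = [p for p in prices
                 if not any(black_box_accepts(p) for _ in range(trials_per_price))]
    low = max(always_yes, default=0.0)   # always buys at this price or below
    high = min(always_no, default=1.0)   # always refuses at this price or above
    return low, high                     # in between, its answers are inconsistent

low, high = elicit()
print(f"implied probability lies somewhere in [{low:.2f}, {high:.2f}]; "
      f"the width {high - low:.2f} is a crude measure of the assignment's precision")
```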
You write:
The problem is that the algorithms that your brain uses to perform common-sense reasoning are not transparent to your conscious mind, which has access only to their final output. This output does not provide a numerical probability estimate, but only a rough and vague feeling of certainty.
But I don’t think we should be looking at introspectable “output”. The purpose of the brain isn’t to produce rough and vague feelings which we can then appreciate through inner contemplation. The purpose of the brain is to produce action, to decide on a course of action and then move the muscles accordingly. Our introspective power is limited at best. Over a lifetime of knowing ourselves we can probably get pretty good at knowing our own beliefs, but I don’t think we should treat introspection as the gold standard of measuring a person’s belief. Like preference, belief is revealed in action. And action is what the gambling method of eliciting probability assignments looks at. While the brain produces only rough and vague feelings of certainty for the purposes of one’s own navel-gazing, at the same time it produces very definite behavior, very definite decisions, from which can be derived, at least in principle, probability assignments—and also, as I mention above, the precision of those probability assignments.
I grant, by implication, that one’s own probability assignments are not necessarily introspectable. That goes without saying.
You write:
Therefore, there are only two ways in which you can arrive at a numerical probability estimate for a common-sense belief:
Translate your vague feeling of certainty into a number in some arbitrary manner. This however makes the number a mere figure of speech, which adds absolutely nothing over the usual human vague expressions for different levels of certainty.
Perform some probability calculation, which however has nothing to do with how your brain actually arrived at your common-sense conclusion, and then assign the probability number produced by the former to the latter. This is clearly fallacious.
Your first described way takes the vague feeling for the output of the black box. But the purpose of the black box is action, decision, and that is the output that we should be looking at, and it’s the output that the gambling method looks at. And that is a third way of arriving at a numerical probability which you didn’t cover.
Aside from some quibbles that aren’t really worth getting into, I have no significant disagreement with your comments. There is nothing wrong with looking at people’s acts in practice and observing that they behave as if they operated with subjective probability estimates in some range. However, your statement that “one’s own probability assignments are not necessarily introspectable” basically restates my main point, which was exactly about the meaninglessness of analyzing one’s own common-sense judgments to arrive at a numerical probability estimate, which many people here, in contrast, consider to be the right way to increase the accuracy of one’s thinking. (Though I admit that it should probably be worded more precisely to make sure it’s interpreted that way.)
However, your statement that “one’s own probability assignments are not necessarily introspectable” basically restates my main point, which was exactly about the meaninglessness of analyzing one’s own common-sense judgments to arrive at a numerical probability estimate, which many people here, in contrast, consider to be the right way to increase the accuracy of one’s thinking.
As it happens, early on I voted your initial comment down (following the topsy-turvy rules of the main post) because based on my first impression I thought I agreed with you. Reconsideration of your comment in light of the ensuing discussion brought to my mind this seeming objection. But you have disarmed the objection, so I am back to agreement.
Although lots of people here consider it a hallmark of “rationality,” assigning numerical probabilities to common-sense conclusions and beliefs is meaningless, except perhaps as a vague figure of speech. (Absolutely certain.)
I’m not sure whether to chide you or giggle at the self-reference. I suspect, though, that “absolutely certain” is not a confidence level.
I want to vote you down in agreement, but I don’t have enough karma.
It is risky to deprecate something as “meaningless”—a ritual, a practice, a word, an idiom. Risky because the actual meaning may be something very different than you imagine. That seems to be the case here with attaching numbers to subjective probabilities.
The meaning of attaching a number to something lies in how that number may be used to generate a second number that can then be attached to something else. There is no point in providing a number to associate with the variable ‘m’ (i.e. that number is meaningless) unless you simultaneously provide a number to associate with the variable ‘f’ and then plug both into “f=ma” to generate a third number to associate with the variable ‘a’, an number which you can test empirically.
Similarly, a single isolated subjective probability estimate may seem somewhat meaningless in isolation, but if you place it into a context with enough related subjective probability estimates and empirically measured frequencies, then all those probabilities and frequencies can be combined and compared using the standard formulas of Bayesian probability:
P(~A) = 1 - P(A)
P(B|A)*P(A)=P(A&B)=P(A|B)*P(B)
So, if you want to deprecate as “meaningless” my estimate that the Democrats have a 40% chance to maintain their House majority in the next election, go ahead. But you cannot then also deprecate my estimate that the Republicans have a 70% of reaching a House majority. Because the conjunction of those two probability estimates is not meaningless. It is quite respectably false.
I think you’re not drawing a clear enough distinction between two different things, namely the mathematical relationships between numbers, and the correspondence between numbers and reality.
If you ask an astronomer what is the mass of some asteroid, he will presumably give you a number with a few significant digits and and uncertainty interval. If you ask him to justify this number, he will be able to point to some observations that are incompatible with the assumption that the mass is outside this interval, which follows from a mathematical argument based on our best knowledge of physics. If you ask for more significant digits, he will say that we don’t know (and that beyond a certain accuracy, the question doesn’t even make sense, since it’s constantly losing and gathering small bits of mass). That’s what it means for a number to be rigorously justified.
But now imagine that I make an uneducated guess of how heavy this asteroid might be, based on no actual astronomical observation. I do of course know that it must be heavier than a few tons or otherwise it wouldn’t be noticeable from Earth as an identifiable object, and that it must be lighter than 10^20 or so tons since that’s roughly the range where smaller planets are, but it’s clearly nonsensical for me to express that guess with even one digit of precision. Yet I could insist on a precise guess, and claim that it’s “meaningful” in a way analogous to your above justification of subjective probability estimates, by deriving various mathematical and physical implications of this fact. If you deprecate my claim that its mass is 4.5237 x 10^15kg, then you cannot also deprecate my claim that it is a sphere of radius 1km and average density 1000kg/m^3, since the conjunction of these claims is by the sheer force of mathematics false.
Therefore, I don’t see how you can argue that a number is meaningful by merely noting its relationships with other numbers that follow from pure mathematics. Or am I missing something with this analogy?
The only thing you are missing is the first paragraph of my reply. Just because something doesn’t have the kind of meaning you think it ought to have (by virtue of being a number, for example) that doesn’t justify your claim that it is meaningless.
Subjective probabilities of isolated propositions don’t have the kind of meaning you want numbers to have. But they have exactly the kind of meaning I want them to have—specifically they can be used in computations that produce consistent results.
Do you think that the digits of pi beyond the first half dozen are also meaningless?
Perplexed:
Fair enough, but I still don’t see how this solves the problem of the correspondence between numbers and reality. Any number can be used in computations that produce consistent results if you just start plugging it into formulas derived from some consistent mathematical theory. It is when the numbers are used as basis for claims about the real, physical world that I insist on an explanation of how exactly they are derived and how their claimed correspondence with reality is justified.
The digits of pi are an artifact of pure mathematics, so I don’t think it’s a good analogy for what we’re talking about. Once you’ve built up enough mathematics to define lengths of curves in Euclidean geometry, the ratio between the circumference and diameter of a circle follows by pure logic. Any suitable analogy for what we’re talking about must encompass empirical knowledge, and claims which can be falsified by empirical observations.
It doesn’t have to. That is a problem you made up. Other people don’t have to buy in to your view on the proper relationship between numbers and physical reality.
My viewpoint on numbers is somewhere between platonism and formalism. I think that the meaning of a number is a particular structure in my mind. If I have an axiom system that is categorical (and, of course, usually I don’t) then that picture in my mind can be made inter-subjective in that someone who also accepts those axioms can build an isomorphic structure in their own mind. The real world has absolutely nothing to do with Tarski’s semantics—which is where I look to find out what the “meaning” of a number is.
Your complaint that subjective probabilities have no meaning is very much like the complaint of a new convert to atheism who laments that without God, life has no meaning. My advice: stop telling other people what the word “meaning” should mean.
However, if you really need some kind of affirmation, then I will provide some. I agree with you that the numbers used in subjective probabilities are less, … what is the right word, … less empirical than are the numbers you usually find in science classes. Does that make you feel better?
Perplexed:
You probably wouldn’t buy that same argument if it came from a numerologist, though. I don’t think I hold any unusual and exotic views on this relationship, and in fact, I don’t think I have made any philosophical assumptions in this discussion beyond the basic common-sense observation that if you want to use numbers to talk about the real world, they should have a clear connection with something that can be measured or counted to make any sense. I don’t see any relevance of these (otherwise highly interesting) deep questions of the philosophy of math for any of my arguments.
There is nothing philosophically wrong with your position except your choice of the word “meaningless” as an epithet for the use of numbers which cannot be empirically justified. Your choice of that word is pretty much the only reason I am disagreeing with you.
Given your position on the meaninglessness of assigning a numerical probability value to a vague feeling of how likely something is, how would you decide whether you were being offered good odds if offered a bet? If you’re not in the habit of accepting bets, how do you think someone who does this for a living (a bookie for example) should go about deciding on what odds to assign to a given bet?
mattnewport:
In reality, it is rational to bet only with people over whom you have superior relevant knowledge, or with someone who is suffering from an evident failure of common sense. Otherwise, betting is just gambling (which of course can be worthwhile for fun or signaling value). Look at the stock market: it’s pure gambling, unless you have insider knowledge or vastly higher expertise than the average investor.
This is the basic reason why I consider the emphasis on subjective Bayesian probabilities that is so popular here misguided. In technical problems where probability calculations can be helpful, the experts in the field already know how to use them. On the other hand, for the great majority of the relevant beliefs and conclusions you’ll form in life, they offer nothing useful beyond what your vague common sense is already telling you. If you start taking them too seriously, it’s easy to start fooling yourself that your thinking is more accurate and precise than it really is, and if you start actually betting on them, you’ll be just gambling.
I’m not familiar with the details of this business, but from what I understand, bookmakers work in such a way that they’re guaranteed to make a profit no matter what happens. Effectively, they exploit the inconsistencies between different people’s estimates of what the favorable odds are. (If there are bookmakers who stake their profit on some particular outcome, then I’m sure that they have insider knowledge if they can stay profitable.) Now of course, the trick is to come up with a book that is both profitable and offers odds that will sell well, but here we get into the fuzzy art of exploiting people’s biases for profit.
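For concreteness, here is the standard balanced-book arithmetic in toy form (a two-outcome event with invented figures, not a claim about how any particular bookmaker actually operates): quote odds whose implied probabilities sum to more than 1, and take stakes in proportion to those probabilities so the payout is covered whichever outcome occurs:

```python
# Toy balanced book for a two-outcome event. Implied probabilities sum to ~1.05
# (a 5% "overround"); all figures are invented.

implied = {"A wins": 0.60, "B wins": 0.45}             # sums to 1.05
decimal_odds = {k: 1 / p for k, p in implied.items()}  # odds actually quoted

total_staked = 1000.0
# Take stakes on each outcome in proportion to its implied probability,
# so the payout is the same whichever side wins.
stakes = {k: total_staked * p / sum(implied.values()) for k, p in implied.items()}

for outcome in implied:
    payout = stakes[outcome] * decimal_odds[outcome]
    print(f"if {outcome}: pay out {payout:.2f} of {total_staked:.2f} staked"
          f" -> bookmaker profit {total_staked - payout:.2f}")
```

With the stakes balanced this way, the bookmaker’s profit is the same whichever outcome occurs, which is exactly the sense in which the book exploits inconsistencies between bettors’ estimates rather than any view of its own.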
You still have to be able to translate your superior relevant knowledge into odds in order to set the terms of the bet however. Do you not believe that this is an ability that people have varying degrees of aptitude for?
Vastly higher expertise than the average investor would appear to include something like the ability in question—translating your beliefs about the future into a probability such that you can judge whether investments have positive expected value. If you accept that true alpha exists (and the evidence suggests that, though rare, a small percentage of the best investors do appear to have positive alpha), then what process do you believe those who possess it use to decide which investments are good and which bad?
What’s your opinion on prediction markets? They seem to produce fairly good probability estimates so presumably the participants must be using some better-than-random process for arriving at numerical probability estimates for their predictions.
They certainly aim for a balanced book but they wouldn’t be very profitable if they were not reasonably competent at setting initial odds (and updating them in the light of new information). If the initial odds are wildly out of line with their customers’ then they won’t be able to make a balanced book.
mattnewport:
They sure do, but in all the examples I can think of, people either just follow their intuition directly when faced with a concrete situation, or employ rigorous science to attack the problem. (It doesn’t have to be the official accredited science, of course; the Venn diagram of official science and valid science features only a partial overlap.) I just don’t see any practical examples of people successfully betting by doing calculations with probability numbers derived from their intuitive feelings of confidence that would go beyond what a mere verbal expression of these feelings would convey. Can you think of any?
Well, if I knew, I would be doing it myself—and I sure wouldn’t be talking about it publicly!
The problem with discussing investment strategies is that any non-trivial public information about this topic necessarily has to be bullshit, or at least drowned in bullshit to the point of being irrecoverable, since exclusive possession of correct information is a sure path to getting rich, but its effectiveness critically depends on exclusivity. Still, I would be surprised to find out that the success of some alpha-achieving investors is based on taking numerical expressions of common-sense confidence seriously.
In a sense, a similar problem faces anyone who aspires to be more “rational” than the average folk in any meaningful sense. Either your “rationality” manifests itself only in irrelevant matters, or you have to ask yourself what is so special and exclusive about you that you’re reaping practical success that eludes so many other people, and in such a way that they can’t just copy your approach.
I agree with this assessment, but the accuracy of information aggregated by a prediction market implies nothing about your own individual certainty. Prediction markets work by cancelling out random errors and enabling specialists who wield esoteric expertise to take advantage of amateurs’ systematic biases. Where your own individual judgment falls within this picture, you cannot know, unless you’re one of these people with esoteric expertise.
I’d speculate that bookies and professional sports bettors are doing something like this. By bookies here I mean primarily the kind of individuals who stand with a chalkboard at race tracks rather than the large companies. They probably use some semi-rigorous / scientific techniques to analyze past form and then mix it with a lot of intuition / expertise together with lots of detailed domain specific knowledge and ‘insider’ info (a particular horse or jockey has recently recovered from an illness or injury and so may perform worse than expected, etc.). They’ll then integrate all of this information together using some non mathematically rigorous opaque mental process and derive a probability estimate which will determine what odds they are willing to offer or accept.
I’ve read a fair bit of material by professional investors and macro hedge fund managers describing their thinking and how they make investment decisions. I think they are often doing something similar. Integrating information derived from rigorous analysis with more fuzzy / intuitive reasoning based on expertise, knowledge and experience and using it to derive probabilities for particular outcomes. They then seek out investments that currently appear to be mis-priced relative to the probabilities they’ve estimated, ideally with a fairly large margin of safety to allow for the imprecise and uncertain nature of their estimates.
It’s entirely possible that this is not what’s going on at all but it appears to me that something like this is a factor in the success of anyone who consistently profits from dealing with risk and uncertainty.
My experience leads me to believe that this is not entirely accurate. Investors are understandably reluctant to share very specific time critical investment ideas for free but they frequently share their thought processes for free and talk in general terms about their approaches and my impression is that they are no more obfuscatory or deliberately misleading than anyone else who talks about their success in any field.
In addition, hedge fund investor letters often share quite specific details of reasoning after the fact once profitable trades have been closed and these kinds of details are commonly elaborated in books and interviews once time-sensitive information has lost most of its value.
This seems to be taking the ethos of the EMH a little far. I comfortably attribute a significant portion of my academic and career success to being more intelligent and a clearer thinker than most people. Anyone here who through a sense of false modesty believes otherwise is probably deluding themselves.
This seems to be the main point of ongoing calibration exercises. If you have a track record of well calibrated predictions then you can gain some confidence that your own individual judgement is sound.
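A minimal sketch of such a calibration check (the prediction log is invented; a real one would be your own recorded predictions and outcomes): group predictions by stated probability, compare stated to observed frequencies, and optionally summarize with a Brier score:

```python
# Toy calibration check over a log of (stated probability, did it happen) pairs.
# The log below is invented; a real one would come from your own predictions.

from collections import defaultdict

log = [(0.9, True), (0.9, True), (0.9, False), (0.9, True),
       (0.7, True), (0.7, False), (0.7, True),
       (0.5, False), (0.5, True)]

buckets = defaultdict(list)
for stated, happened in log:
    buckets[stated].append(happened)

for stated in sorted(buckets):
    outcomes = buckets[stated]
    observed = sum(outcomes) / len(outcomes)
    print(f"stated {stated:.0%}: happened {observed:.0%} of the time "
          f"({len(outcomes)} predictions)")

# One summary number (Brier score; lower is better, and always saying 50% scores 0.25):
brier = sum((p - o) ** 2 for p, o in log) / len(log)
print(f"Brier score: {brier:.3f}")
```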
Overall I don’t think we have a massive disagreement here. I agree with most of your reservations and I’m by no means certain that improving one’s own calibration is feasible but I suspect that it might be and it seems sufficiently instrumentally useful that I’m interested in trying to improve my own.
mattnewport:
Your knowledge about these trades seems to be much greater than mine, so I’ll accept these examples. In the meantime, I have expounded my whole view of the topic in a reply to an excellent systematic list of questions posed by prase, and in those terms, this would indicate the existence of what I called the third type of exceptions under point (3). I still maintain that these are rare exceptions in the overall range of human judgments, though, and that my basic point holds for the overwhelming majority of human common-sense thinking.
I don’t think they’re being deliberately misleading. I just think that the whole mechanism by which the public discourse on these topics comes into being inherently generates a nearly impenetrable confusion, which you can dispel to extract useful information only if you are already an expert in the first place. There are many specific reasons for this, but it all ultimately comes down to the stability of the weak EMH equilibrium.
Oh, absolutely! But you’re presumably estimating the rank of your abilities based on some significant accomplishments that most people would indeed find impossible to achieve. What I meant to say (even though I expressed it poorly) is that there is no easy and readily available way to excel at “rationality” in any really relevant matters. This in contrast to the attitude, sometimes seen among the people here, that you can learn about Bayesianism or whatever else and just by virtue of that set yourself apart from the masses in accuracy of thought. The EMH ethos is, in my opinion, a good intellectual antidote against such temptations of hubris.
You’re dodging the question. What if the odds arose from a natural process, so that there isn’t a person on the other side of the bet to compare your state of knowledge against?
I think this is right. The idea that you would be betting against another person is inessential, an unfortunate distraction arising from the choice of thought experiment. Admittedly it’s a natural way to understand the thought experiment, but it’s inessential. The experiment could be revised to exclude it. In fact every moment we make decisions whose outcomes depend on things we don’t know, and in making those decisions we are therefore in effect gambling. We are surrounded by risks, and our decisions reveal our assessment of those risks.
jimrandomh:
Maybe it’s my failure of English comprehension (I’m not a native speaker, as you might guess from my frequent grammatical errors), but when I read the phrase “being offered good odds if offered a bet,” I understood it as asking about a bet with opponents who stand to lose if my guess is right. So, honestly, I wasn’t dodging the question.
But to answer your question, it depends on the concrete case. Some natural processes can be approximated with models that yield useful probability estimates, and faced with some such process, I would of course try to use the best scientific knowledge available to calculate the odds if the stakes are high enough to justify the effort. When this is not possible, however, the only honest answer is that my decision would be guided by whatever intuitive feeling my brain happens to produce after some common-sense consideration, and unless this intuitive feeling told me that losing the bet is extremely unlikely, I would refuse to bet. And I honestly cannot think of a situation where translating this intuitive feeling of certainty into numbers would increase the clarity and accuracy of my thinking, or provide for any useful practical guidelines.
For example, if I come across a ditch and decide to jump over to save the effort of walking around to cross over a bridge, I’m effectively betting that it’s narrow enough to jump over safely. In reality, I’ll feel intuitively either that it’s safe to jump or not, and I’ll act on that feeling, produced by some opaque module for physics calculations in my brain. Of course, my conclusion might be wrong, and as a kid I would occasionally injure myself by judging wrongly in such situations, but how can I possibly quantify this feeling of certainty numerically in a meaningful way? It simply makes no sense. The overwhelming majority of real-life cases where I have to produce some judgment, and perhaps even bet on it, are of this sort.
It would be cool to have a brain that produces confidence estimates for its conclusions with greater precision, but mine simply isn’t like that, and it’s useless to pretend that it is.
Applying the view of probability as willingness to bet, you can’t refuse to reveal your probability assignments. Life continually throws at us risky choices. You can perform risky action X with high-value success Y and high-cost failure Z or you can refuse to perform it, but both actions reveal something about your probability assignments. If you perform the risky action X, it reveals that you assign sufficiently high probability to Y (i.e. low to Z) given the values that you place on Y and Z. If you refuse to perform risky action X, it reveals that you assign sufficiently low probability to Y given the values you place on Y and Z. This is nothing other than your willingness to bet.
In an actual case, your simple yes/no response to a given choice is not enough to reveal your probability assignment and only reveals some information about it (that it is below or above a certain value). But counterfactually, we can imagine infinite variations on the choice you are presented with, and for each of these choices, there is a response which (counterfactually) you would have given. This set of responses manifests your probability assignment (and reveals also its degree of precision). Of course, in real life, we can’t usually conduct an experiment that reveals a substantial portion of this set of counterfactuals, so in real life, we remain in the dark about your probability assignment (unless we find some clever alternative way to elicit it than the direct, brute force test-all-variations approach I have just described). But the counterfactuals are still there, and still define a probability assignment, even if we don’t know what it is.
But this revealed probability assignment is parallel to revealed preference. The point of revealed preference is not to help the consumer make better choices. It is a conceptual and sometimes practical tool of economics. The economist studying people discovers their preferences by observing their purchases. And similarly, we can discover a person’s probability assignments by observing his choices. The purpose need not be to help that person to increase the clarity or accuracy of his own thinking, any more than the purpose of revealed preference is to help the consumer shop.
A person interested in self-knowledge, for whatever reason, might want to observe his own behavior in order to discover his own preferences. I think that people like Roissy in DC may be able to teach women about themselves if they choose to read him, teach them about what they really want in a man by pointing out what their behavior is, pointing out that they pursue certain kinds of men and shun others. Women—along with everybody else—are apparently suffering from many delusions about what they want, thinking they want one thing, but actually wanting another—as revealed by their behavior. This self-knowledge may or may not be helpful, but surely at least some women would be interested in it.
But as a matter of fact your choice is influenced by several factors, including the reward of successfully jumping over the ditch (i.e. the reduction in walking time) and the cost of attempting the jump and failing, along with the width of the gap. As these factors are (counterfactually) varied, a possibly precise picture of your probability assignment may emerge. That is, it may turn out that you are willing to risk the jump if failure would only sprain an ankle, but unwilling to risk the jump if failure is certain death. This would narrow down the probability of success that you have assigned to the jump—it would be probable enough to be worth risking the sprained ankle, but not probable enough to be worth risking certain death. This probability assignment is not necessarily anything that you have immediately available to your conscious awareness, but in principle it can be elicited through experimentation with variations on the scenario.
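To spell out the bracketing arithmetic in the ditch example (all utilities below are invented placeholders): someone who jumps when failure means only a sprained ankle, but refuses when failure means death, has in effect placed their probability of success between two thresholds:

```python
# Bracketing an implicit probability of success from two hypothetical choices.
# Expected-value condition for jumping: p * gain > (1 - p) * cost,
# i.e. jump only if p > cost / (gain + cost). All numbers are invented.

def jump_threshold(gain: float, cost: float) -> float:
    """Minimum success probability at which jumping beats walking around."""
    return cost / (gain + cost)

gain = 1.0           # value of the time saved (arbitrary units)
cost_sprain = 20.0   # cost of a sprained ankle, same units
cost_death = 1e6     # cost of death, same units

# Observed (hypothetically): jumps when failure means a sprain,
# refuses when failure means death.
lower = jump_threshold(gain, cost_sprain)  # jumping reveals p is above this
upper = jump_threshold(gain, cost_death)   # refusing reveals p is below this

print(f"implied p(success) lies between {lower:.3f} and {upper:.6f}")
```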
That’s a startling statement (especially out of context).
Are you asking for a defense of the statement, or do you agree with it and are merely commenting on the way I expressed it?
I’ll give a defense by means of an example. At Wikipedia they give the following example of a counterfactual:
If Oswald had not shot Kennedy, then someone else would have.
Now consider the equation F=ma. This is translated at Wikipedia into the English:
A body of mass m subject to a force F undergoes an acceleration a that has the same direction as the force and a magnitude that is directly proportional to the force and inversely proportional to the mass, i.e., F = ma.
Now suppose that there is a body of mass m floating in space, and that it has not been subject to nor is it currently subject to any force. I believe that the following is a true counterfactual statement about the body:
Had this body (of mass m) been subject to a force F then it would have undergone an acceleration a that would have had the same direction as the force and a magnitude that would have been directly proportional to the force and inversely proportional to the mass.
That is a counterfactual statement following the model of the wikipedia example, and I believe it is true, and I believe that the contradiction of the counterfactual (which is also a counterfactual, i.e., the claim that the body would not have undergone the stated acceleration) is false.
I believe that this point can be extended to all the laws of physics, either Newton’s laws or, if they have been replaced, modern laws. And I believe, furthermore, that the point can be extended to higher-level statements about bodies which are not mere masses moving in space, but, say, thinking creatures making decisions.
Is there any part of this with which you disagree?
A point about the insertion of “I believe”. The phrase “I believe” is sometimes used by people to assert their religious beliefs. I don’t consider the point I am making to be a personal religious belief, but the plain truth. I only insert “I believe” because the very fact that you brought up the issue tells me that I may be in mixed company that includes someone whose philosophical education has instilled certain views.
I am merely commenting. Counterfactuals are counterfactual, and so don’t “exist” and can’t be “there” by their very nature.
Yes, of course, they’re part of how we do our analyses.
Upvoted. Definitely can’t back you on this one.
Are you sure you’re not just worried about poor calibration?
Another upvote. That’s crazy talk.
komponisto:
No, my objection is fundamental. I provide a brief explanation in the comment I linked to, but I’ll restate it here briefly.
The problem is that the algorithms that your brain uses to perform common-sense reasoning are not transparent to your conscious mind, which has access only to their final output. This output does not provide a numerical probability estimate, but only a rough and vague feeling of certainty. Yet in most situations, the output of your common sense is all you have. There are very few interesting things you can reason about by performing mathematically rigorous probability calculations (and even when you can, you still have to use common sense to establish the correspondence between the mathematical model and reality).
Therefore, there are only two ways in which you can arrive at a numerical probability estimate for a common-sense belief:
Translate your vague feeling of certainty into a number in some arbitrary manner. This however makes the number a mere figure of speech, which adds absolutely nothing over the usual human vague expressions for different levels of certainty.
Perform some probability calculation, which however has nothing to do with how your brain actually arrived at your common-sense conclusion, and then assign the probability number produced by the former to the latter. This is clearly fallacious.
Honestly, all this seems entirely obvious to me. I would be curious to see which points in the above reasoning are supposed to be even controversial, let alone outright false.
Disagree here. Numbers get people to convey more information about their beliefs. It doesn’t matter whether you actually use numbers, or do something similar (and equivalent) like systematize the use of vague expressions. I’d be just as happy if people used a “five-star” system, or even in many cases if they just compared the belief in question to other beliefs used as reference-points.
Disagree here also. The probability calculation you present should represent your brain’s reasoning, as revealed by introspection. This is not a perfect process, and may be subject to later refinement. But it is definitely meaningful.
For example, consider my current probability estimate of 10^(-3) that Amanda Knox killed her roommate. On my current analysis, this is obtained as follows: I start with a prior of 10^(-4) (from a general homicide rate of about 10^(-3), plus reasoning that Knox is demographically an order of magnitude less likely to kill than the typical person; the figure happens to match my intuitive sense that I’d have to meet about 10,000 similar people before I’d have any fear for my life). Then all the evidence in the case raises the probability by about an order of magnitude at most, yielding 10^(-3).
Now, this is just a rough order-of-magnitude argument. But it’s already much more meaningful and useful than my just saying “I don’t think she did it”. It provides a way of breaking down the reasoning, so that points of disagreement can be precisely identified in an efficient manner. (If you happened to disagree, the next step would be to say something like “but surely evidence X alone raises the odds by more than a factor of ten”, and then we’d iterate the process specifically on X rather than the original proposition.)
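The estimate above is just an odds-form Bayesian update; here is the same order-of-magnitude arithmetic written out (using only the figures already stated in the comment, with no claim to more rigor than they have):

```python
# Order-of-magnitude Bayesian update in odds form, with the figures stated above.

def update(prior_prob: float, likelihood_ratio: float) -> float:
    """Posterior probability from a prior and one aggregate likelihood ratio."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

prior = 1e-4             # demographically adjusted prior of guilt
evidence_factor = 10.0   # "the evidence raises the probability by about an order of magnitude"

posterior = update(prior, evidence_factor)
print(f"posterior ~ {posterior:.4f}")  # ~0.0010, i.e. the 10^(-3) figure above
```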
It’s a very useful technique for keeping debates informative, and preventing them from turning into (pure) status signaling contests.
komponisto:
If I understand correctly, you’re saying that talking about numbers rather than the usual verbal expressions of certainty prompts people to be more careful and re-examine their reasoning more strictly. This may be true sometimes, but on the other hand, numbers also tend to give a false feeling of accuracy and rigor where there is none. One of the usual symptoms (and, in turn, catalysts) of pseudoscience is the use of numbers with spurious precision and without rigorous justification.
In any case, you seem to concede that these numbers ultimately don’t convey any more information than various vague verbal expressions of confidence. If you want to make the latter more systematic and clear, I have no problem with that, but I see no way to turn them into actual numbers without introducing spurious precision.
Trouble is, this is often not possible. Most of what happens in your brain is not amenable to introspection, and you cannot devise a probability calculation that will capture all the important things that happen there. Take your own example:
See, this is where, in my opinion, you’re introducing spurious numerical claims that are at best unnecessary and at worst outright misleading.
First you note that murderers are extremely rare, and that AK is a sort of person especially unlikely to be one. OK, say you can justify these numbers by looking at crime statistics. Then you perform a complex common-sense evaluation of the evidence, and your brain tells you that on the whole it’s weak, so it’s highly unlikely that AK killed the victim. So far, so good. But then you insist on turning this feeling of near-certainty about AK’s innocence into a number, and you end up making a quantitative claim that has no justification at all. You say:
I strongly disagree. Neither is this number you came up with any more meaningful than the simple plain statement “I think it’s highly unlikely she did it,” nor does it offer any additional practical benefit. On the contrary, it suggests that you can actually make a mathematically rigorous case that the number is within some well-defined limits. (Which you do disclaim, but which is easy to forget.)
Even worse, your claims suggest that while your numerical estimates may be off by an order of magnitude or so, the model you’re applying to the problem captures reality well enough that it’s only necessary to plug in accurate probability estimates. But how do you know that the model is correct in the first place? Your numbers are ultimately based on an entirely non-mathematical application of common sense in constructing this model—and the uncertainty introduced there is altogether impossible for you to quantify meaningfully.
Let’s see if we can try to hug the query here. What exactly is the mistake I’m making when I say that I believe such-and-such is true with probability 0.001?
Is it that I’m not likely to actually be right 999 times out of 1000 occasions when I say this? If so, then you’re (merely) worried about my calibration, not about the fundamental correspondence between beliefs and probabilities.
Or is it, as you seem now to be suggesting, a question of attire: no one has any business speaking “numerically” unless they’re (metaphorically speaking) “wearing a lab coat”? That is, using numbers is a privilege reserved for scientists who’ve done specific kinds of calculations?
It seems to me that the contrast you are positing between “numerical” statements and other indications of degree is illusory. The only difference is that numbers permit an arbitrarily high level of precision; their use doesn’t automatically imply a particular level. Even in the context of scientific calculations, the numbers involved are subject to some particular level of uncertainty. When a scientist makes a calculation to 15 decimal places, they shouldn’t be interpreted as distinguishing between different 20-decimal-digit numbers.
Likewise, when I make the claim that the probability of Amanda Knox’s guilt is 10^(-3), that should not be interpreted as distinguishing (say) between 0.001 and 0.002. It’s meant to be distinguished from 10^(-2) and (perhaps) 10^(-4). I was explicit about this when I said it was an order-of-magnitude estimate. You may worry that such disclaimers are easily forgotten—but this is to disregard the fact that similar disclaimers always apply whenever numbers are used in any context!
Here’s the way I do it: I think approximately in terms of the following “scale” of improbabilities:
(1) 10% to 50% (mundane surprise)
(2) 1% to 10% (rare)
(3) 0.1% (=10^(-3)) to 1% (once-in-a-lifetime level surprise on an important question)
(4) 10^(-6) to 10^(-3) (dying in a plane crash or similar)
(5) 10^(-10) to 10^(-6) (winning the lottery; having an experience unique among humankind)
(6) 10^(-100) to 10^(-10) (religions are true)
(7) below 10^(-100) (theoretical level of improbability reached in thought experiments).
Love the logic and the scale, although I think Vladimir_M pokes some important holes specifically at the 10^(-2) to 10^(-3) level.
May I suggest “un-planned for errors?” In my experience, it is not useful to plan for contingencies with about a 1⁄300 chance in happening per trial. For example, on any given day of the year, my favorite cafe might be closed due to the owner’s illness, but I do not call the cafe first to confirm that it is open each time I go there. At any given time, one of my 300-ish acquaintances is probably nursing a grudge against me, but I do not bother to open each conversation with “Hi, do you still like me today?” When, as inevitably happens, I run into a closed cafe or a hostile friend, I usually stop short for a bit; my planning mechanism reports a bug; there is no ‘action string’ cached for that situation, for the simple reason that I was not expecting the situation, because I did not plan for the situation, because that is how rare it is. Nevertheless, I am not ‘surprised’—I know at some level that things that happen about 1⁄300 times are sort of prone to happening once in a while. On the other hand, I would be ‘surprised’ if my favorite cafe had been burned to the ground or if my erstwhile buddy had taken a permanent vow of silence. I expect that these things will never happen to me, and so if they happen I go and double-check my calculations and assumptions, because it seems equally likely that I am wrong about my assumptions and that the 1⁄30,000 event would actually occur. Anyway, the point is that a category 3 event is an event that makes you shut up for a moment but doesn’t make you reexamine any core beliefs.
If you hold most of your core beliefs with probability > .993 then you are almost certainly overconfident in your core beliefs. I’m not talking about stuff like “my senses offer moderately reliable evidence” or “F(g) = GMm/(r^2)”; I’m talking about stuff like “Solomonoff induction predicts that hyperintelligent AIs will employ a timeless decision theory.”
10^-3 is roughly the probability that I try to start my car and it won’t start because the battery has gone bad. Is the scale intended only for questions one asks once per lifetime? There are lots of questions that one asks once a day, hence my car example.
That is precisely why I added the phrase “on an important question”. It was intended to rule out exactly those sorts of things.
The intended reference class (for me) consists of matters like the Amanda Knox case. But if I got into the habit of judging similar cases every day, that wouldn’t work either.
Think “questions I might write a LW post about”.
komponisto:
It’s not that I’m worried about your poor calibration in some particular instance, but that I believe that accurate calibration in this sense is impossible in practice, except in some very special cases.
(To give some sense of the problem: if such calibration were possible, then why not calibrate yourself to generate accurate probabilities about stock market movements and bet on them? It would be an easy and foolproof way to get rich. But of course there is no way you can make your numbers match reality, not in this problem, nor in most other ones.)
The way you put it, “scientists” sounds too exclusive. Carpenters, accountants, cashiers, etc. also use numbers and numerical calculations in valid ways. However, their use of numbers can ultimately be scrutinized and justified in similar ways as the scientific use of numbers (even if they themselves wouldn’t be up to that task), so with that qualification, my answer would be yes.
(And unfortunately, in practice it’s not at all rare to see people using numbers in ways that are fundamentally unsound, which sometimes gives rise to whole edifices of pseudoscience. I discussed one such example from economics in this thread.)
Now, you say:
However, when a scientist makes a calculation with 15 digits of precision, or even just one, he must be able to rigorously justify this degree of precision by pointing to observations that are incompatible with the hypothesis that any of these digits, except the last one, is different. (Or in the case of mathematical constants such as pi and e, to proofs of the formulas used to calculate them.) This disclaimer is implicit in any scientific use of numbers. (Assuming valid science is being done, of course.)
And this is where, in my opinion, you construct an invalid analogy:
But these disclaimers are not at all the same! The scientist’s—or the carpenter’s, for that matter—implicit disclaimer is: “This number is subject to this uncertainty interval, but there is a rigorous argument why it cannot be outside that range.” On the other hand, your disclaimer is: “This number was devised using an intuitive and arbitrary procedure that doesn’t provide any rigorous argument about the range it must be in.”
And if I may be permitted such a comment, I do see lots of instances here where people seem to forget about this disclaimer, and write as if they believed that they could actually become Bayesian inferrers, rather than creatures who depend on capricious black-box circuits inside their heads to make any interesting judgment about anything, and who are (with the present level of technology) largely unable to examine the internal functioning of these boxes and improve them.
I don’t think such usage is unreasonable, but I think it falls under what I call using numbers as vague figures of speech.
I think this statement reflects either an ignorance of finance or the Dark Arts.
First, the stock market is the single worst place to try to test out ideas about probabilities, because so many other people are already trying to predict the market, and so much wealth is at stake. Other people’s predictions will remove most of the potential for arbitrage (reducing ‘signal’), and the insider trading and other forms of cheating generated by the potential for quick wealth will further distort any scientifically detectable trends in the market (increasing ‘noise’). Because investments in the stock market must be made in relatively large quantities to avoid losing your money through trading commissions, a casual theory-tester is likely to run out of money long before hitting a good payoff even if he or she is already well-calibrated.
Of course, in real life, people might be moderately calibrated. The fact that one is capable of making some predictions with some accuracy and precision is not a guarantee that one will be able to reliably and detectably beat even a thin market like a political prediction clearinghouse. Nevertheless, some information is often better than none: I am (rationally) much more concerned about automobile accidents than fires, despite the fact that I know two people who have died in fires and none who have died in automobile accidents. I know this based on my inferences from published statistics, the reliability of which I make further inferences about. I am quite confident (p ~ .95) that it is sensible to drive defensively (at great cost in effort and time) while essentially ignoring fire safety (even though checking a fire extinguisher or smoke detector might take minimal effort).
I don’t play the stock market, though. I’m not that well calibrated, and probably nobody is without access to inside info of one kind or another.
Mass_Driver:
I’m not an expert on finance, but I am aware of everything you wrote about it in your comment. So I guess this leaves us with the second option. The Dark Arts hypothesis is probably that I’m using the extreme example of the stock market to suggest a general sweeping conclusion that in fact doesn’t hold in less extreme cases.
To which I reply: yes, the stock market is an extreme example, but I honestly can’t think of any other examples that would show otherwise. There are many examples of scientific models that provide more or less accurate probability estimates for all kinds of things, to be sure, but I have yet to hear about people achieving practical success in anything relevant by translating their common-sense feelings of confidence in various beliefs into numerical probabilities.
In my view, calibration of probability estimates can succeed only if (1) you come up with a valid scientific model which you can then use in a shut-up-and-calculate way instead of applying common sense (though you still need it to determine whether the model is applicable in the first place), or (2) you make an essentially identical judgment many times, and from your past performance you extrapolate how frequently the black box inside your head tends to be right.
Now, you try to provide some counterexamples:
Frankly, the only subjective probability estimate I see here is the p~0.95 for your belief about driving. In this case, I’m not getting any more information from this number than if you just described your level of certainty in words, nor do I see any practical application to which you can put this number. I have no objection to your other conclusions, but I see nothing among them that would be controversial to even the most extreme frequentist.
Not sure who voted down your reply; it looks polite and well-reasoned to me.
I believe you when you say that the stock market was honestly intended as representative, although, of course, I continue to disagree about whether it actually is representative.
Here are some more counterexamples:
*When deciding whether to invest in an online bank that pays 1% interest or a local community bank that pays 0.1% interest, I must calculate the odds that each bank will fail before I take my money out; I cannot possibly have a scientific model that generates replicable results for these two banks while also holding down a day job, but numbers will nevertheless help me make a decision that is not driven by an emotional urge to stay with (or leave) an old bank based on customer service considerations that I rationally value as far less than the value of my principal.
*When deciding whether to donate time, money, or neither to a local election campaign, it will help to know which of my donations will have a 10^-6 chance, a 10^-4 chance, and a 10^-2 chance of swinging the election. Numbers are important here because irrational friends and colleagues will urge me to do what ‘feels right’ or to ‘do my part’ without pausing to consider whether this serves any of our goals. If I can generate a replicable scientific model that says whether an extra $500 will win an election, I should stop electioneering and sign up for a job as a tenured political science faculty member, but I nevertheless want to know what the odds are, approximately, in each case, if only so that I can pick which campaign to work on.
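To make the election example concrete, here is a minimal expected-value sketch. Only the 10^-6, 10^-4, and 10^-2 swing probabilities come from the example above; the dollar value placed on the preferred outcome, the costs, and the option names are invented purely for illustration.

```python
# Minimal sketch of the election-donation comparison; all dollar figures and
# option names are invented, only the swing probabilities come from the example.
value_of_win = 50_000.0            # my (hypothetical) subjective value of the preferred outcome
options = {
    "donate $100 to a national race": (1e-6, 100.0),
    "donate $100 to a state race":    (1e-4, 100.0),
    "volunteer for a local race":     (1e-2, 300.0),  # volunteered time valued at $300
}

for name, (p_swing, cost) in options.items():
    expected_gain = p_swing * value_of_win - cost
    print(name, expected_gain)
# Only the local race comes out positive (+200) with these made-up numbers; the
# point is that the choice turns on the probabilities, not on what "feels right".
```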
As for your objection that:
I suppose I have left a few steps out of my analysis, which I am spelling out in full now:
*Published statistics say that the risk of dying in a fire is 10^-7 per person-year and the risk of dying in a car crash is 10^-4 per person-year (a report of what is no doubt someone else’s subjective but relatively evidence-based estimate).
*The odds that these statistics are off by more than a factor of 10 relative to each other are less than 10^-1 (a subjective estimate).
*My cost in effort to protect against car crashes is more than 10 times higher than my cost in effort to protect against fires.
*I value the disutility of death-by-fire and death-by-car-crash roughly equally.
*Therefore, there exists a coherent utility function with respect to the relevant states of the world and my relevant strategies such that it is rational for me to protect against car crashes but not fires.
*Therefore, one technique that could be used to show that my behavior is internally incoherent has failed to reject the null hypothesis.
*Therefore, I have some Bayesian evidence that my behavior is rational.
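A minimal numerical sketch of the comparison just spelled out, under stated assumptions: only the two published per-person-year risk figures come from the argument; the disutility of death, the effort costs, and the assumed 50% risk reduction from each precaution are invented placeholders.

```python
# Sketch of the car-crash vs. fire comparison.  Only the two per-person-year
# risks come from the published statistics cited above; everything else is a
# made-up placeholder chosen to illustrate the structure of the argument.
p_crash_death = 1e-4   # per person-year, from published statistics
p_fire_death = 1e-7    # per person-year, from published statistics
disutility_of_death = 1e7        # same for either cause (step 4), arbitrary units
cost_defensive_driving = 100.0   # effort per year, arbitrary units (step 3)
cost_fire_precautions = 5.0      # effort per year, arbitrary units

def net_benefit(annual_risk, cost, risk_reduction=0.5):
    # Expected disutility avoided by the precaution, minus the effort spent,
    # assuming the precaution halves the relevant risk (another invented number).
    return annual_risk * risk_reduction * disutility_of_death - cost

print(net_benefit(p_crash_death, cost_defensive_driving))  # 400.0 -> worth doing
print(net_benefit(p_fire_death, cost_fire_precautions))    # -4.5  -> not worth doing
```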
Please let me know if you still think I’m just putting fancy arithmetic labels on what is essentially ‘frequentist’ reasoning, and, if so, exactly what you mean by ‘frequentist.’ I can Wikipedia the standard definition, but it doesn’t quite seem to fit here, imho.
Regarding your examples with banks and donations, when I imagine myself in such situations, I still don’t see how numbers derived from my own common-sense reasoning can be useful. I can see myself making a decision based on a simple common-sense impression that one bank looks less shady, or that it’s bigger and thus more likely to be bailed out, etc. Similarly, I could act on a vague impression that one political candidacy I’d favor is far more hopeless than another, and so on. On the other hand, I could also judge from the results of calculations based on numbers from real expert input, like actuarial tables for failures of banks of various types, or the poll numbers for elections, etc.
What I cannot imagine, however, is doing anything sensible and useful with probabilities dreamed up from vague common-sense impressions. For example, looking at a bank, getting the impression that it’s reputable and solid, and then saying, “What’s the probability it will fail before time T? Um.. seems really unlikely… let’s say 0.1%.”, and then using this number to calculate my expected returns.
Now, regarding your example with driving vs. fires, suppose I simply say: “Looking at the statistical tables, one is far more likely to be killed in a car accident than in a fire. I don’t see any way in which I’m exceptional in my exposure to either, so if I want to make myself safer, it would be stupid to invest more effort in reducing the chance of fire than in more careful driving.” What precisely have you gained with your calculation relative to this plain and clear English statement?
In particular, what is the significance of these subjectively estimated probabilities like p=10^-1 in step 2? What more does this number tell us than a simple statement like “I don’t think it’s likely”? Also, notice that my earlier comment specifically questioned the meaningfulness and practical usefulness of the numerical claim that p~0.95 for this conclusion, and I don’t see how it comes out of your calculation. These seem to be exactly the sorts of dreamed-up probability numbers whose meaningfulness I’m denying.
It seems plausible to me that routinely assigning numerical probabilities to predictions/beliefs that can be tested and tracking these over time to see how accurate your probabilities are (calibration) can lead to a better ability to reliably translate vague feelings of certainty into numerical probabilities.
There are practical benefits to developing this ability. I would speculate that successful bookies and professional sports bettors are better at this than average for example and that this is an ability they have developed through practice and experience. Anyone who has to make decisions under uncertainty seems like they could benefit from a well developed ability to assign well calibrated numerical probability estimates to vague feelings of certainty. Investors, managers, engineers and others who must deal with uncertainty on a regular basis would surely find this ability useful.
I think a certain degree of skepticism is justified regarding the utility of various specific methods for developing this ability (things like predictionbook.com don’t yet have hard evidence for their effectiveness) but it certainly seems like it is a useful ability to have and so there are good reasons to experiment with various methods that promise to improve calibration.
I addressed this point in another comment in this thread:
http://lesswrong.com/lw/2sl/the_irrationality_game/2qgm
I agree with most of what you’re saying (in that comment and this one) but I still think that the ability to give well calibrated probability estimates for a particular prediction is instrumentally useful and that it is fairly likely that this is an ability that can be improved with practice. I don’t take this to imply anything about humans performing actual Bayesian calculations either implicitly or explicitly.
I have read most of the responses and still am not sure whether to upvote or not. I doubt among several (possibly overlapping) interpretations of your statement. Could you tell to what extent the following interpretations really reflect what you think?
(1) Confession of frequentism. Only sensible numerical probabilities are those related to frequencies, i.e. either frequencies of outcomes of repeated experiments, or probabilities derived from them. (Creative drawing of reference-class boundaries may be permitted.) In particular, prior probabilities are meaningless.
(2) Any sensible numbers must be produced using procedures that ultimately don’t include any numerical parameters (except perhaps small integers like 2, 3, 4). Any number which isn’t the result of such a procedure is labeled arbitrary, and therefore meaningless. (Observation and measurement, of course, do count as permitted procedures. Admittedly arbitrary steps, like choosing units of measurement, are also permitted.)
(3) Degrees of confidence shall be expressed without reflexive thinking about them. Trying to establish a fixed scale of confidence levels (like impossible—very unlikely—unlikely—possible—likely—very likely—almost certain—certain), or actively trying to compare degrees of confidence in different beliefs, is cheating, since such scales can then be converted into numbers using a non-numerical procedure.
(4) The question of whether somebody is well calibrated is confused for some reason. Calibrating people makes no sense. Although we may take the “almost certain” statements of a person and look at how often they are true, the resulting frequency has no meaning for some reason.
(5) Unlike (3), beliefs can be ordered or classified on some scale (possibly imprecisely), but assigning numerical values brings confusing connotations and should be avoided. Put differently, the meaning of subjective probabilities is preserved under monotonic rescaling.
(6) Although, strictly speaking, human reasoning can be modelled as a Bayesian network where beliefs have numerical strengths, human introspection is poor at assessing their values. Declared values are more likely to depend on anchoring than on the real strength of the belief. Speaking about numbers actually introduces noise into reasoning.
(7) Human reasoning cannot be modelled by Bayesian inference, not even approximately.
That’s an excellent list of questions! It will help me greatly to systematize my thinking on the topic.
Before replying to the specific items you list, perhaps I should first state the general position I’m coming from, which motivates me to get into discussions of this sort. Namely, it is my firm belief that when we look at the present state of human knowledge, one of the principal sources of confusion, nonsense, and pseudoscience is physics envy, which leads people in all sorts of fields to construct nonsensical edifices of numerology and then pretend, consciously or not, that they’ve reached some sort of exact scientific insight. Therefore, I believe that whenever one encounters people talking about numbers of any sort that look even slightly suspicious, they should be considered guilty until proven otherwise—and this entire business with subjective probability estimates for common-sense beliefs doesn’t come even close to clearing that bar for me.
Now to reply to your list.
My answer to (1) follows from my opinion about (2).
In my view, a number that gives any information about the real world must ultimately refer, either directly or via some calculation, to something that can be measured or counted (at least in principle, perhaps using a thought-experiment). This doesn’t mean that all sensible numbers have to be derived from concrete empirical measurements; they can also follow from common-sense insight and generalization. For example, reading about Newton’s theory leads to the common-sense insight that it’s a very close approximation of reality under certain assumptions. Now, if we look at the gravity formula F=m1*m2/r^2 (in units set so that G=1), the number 2 in the denominator is not a product of any concrete measurement, but a generalization from common sense. Yet what makes it sensible is that it ultimately refers to measurable reality via a well-defined formula: measure the force between two bodies of known masses at distance r, and you’ll get log(m1*m2/F)/log(r) = 2.
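To illustrate the sense in which the exponent “refers to measurable reality,” here is a small sketch of that idealized thought-experiment, assuming a noiseless measurement in units where G = 1; the particular masses and distance are arbitrary.

```python
import math

# Idealized, noiseless "measurement" in units where G = 1: two bodies of known
# masses m1 and m2 at distance r, with the force F read off an instrument.
m1, m2, r = 3.0, 5.0, 2.0
F = m1 * m2 / r**2   # what such a measurement would return if the law holds

# Recovering the exponent from the measured quantities alone:
exponent = math.log(m1 * m2 / F) / math.log(r)
print(exponent)      # -> 2.0 (up to floating-point rounding), the "2" in the formula
```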
Now, what can we make out of probabilities from this viewpoint? I honestly can’t think of any sensible non-frequentist answer to this question. Subjectivist Bayesian phrases such as “the degree of belief” sound to me entirely ghostlike unless this “degree” is verifiable via some frequentist practical test, at least in principle. In this sense, I do confess frequentism. (Though I don’t wish to subscribe to all the related baggage from various controversies in statistics, much of which is frankly over my head.)
That depends on the concrete problem under consideration, and on the thinker who is considering it. The thinker’s brain produces an answer alongside a more or less fuzzy feeling of confidence, and human language has the capacity to express these feelings with about the same level of fuzziness as that signal. It can be sensible to compare intuitive confidence levels, if such comparison can be put to a practical (i.e. frequentist) test. Eight ordered intuitive levels of certainty might perhaps be too much, but with, say, four levels, I could produce four lists of predictions labeled “almost impossible,” “unlikely,” “likely,” and “almost certain,” such that common sense would tell us that, with near-certainty, those in each subsequent list would turn out to be true in ever greater proportion.
If I wish to express these probabilities as numbers, however, this is not a legitimate step unless the resulting numbers can be justified in the sense discussed above under (1) and (2). This requires justification both in the sense of defining what aspect of reality they refer to (where frequentism seems like the only answer), and guaranteeing that they will be accurate under empirical tests. If they can be so justified, then we say that the intuitive estimate is “well-calibrated.” However, calibration is usually not possible in practice, and there are only two major exceptions.
The first possible path towards accurate calibration is when the same person performs essentially the same judgment many times, and from the past performance we extract the frequency with which their brain tends to produce the right answer. If this level of accuracy remains roughly constant in time, then it makes sense to attach it as the probability to that person’s future judgments on the topic. This approach treats the relevant operations in the brain as a black box whose behavior, being roughly constant, can be subjected to such extrapolation.
The second possible path is reached when someone has a sufficient level of insight about some problem to cross the fuzzy limit between common-sense thinking and an actual scientific model. Increasingly subtle and accurate thinking about a problem can result in the construction of a mathematical model that approximates reality well enough that when applied in a shut-up-and-calculate way, it yields probability estimates that will be subsequently vindicated empirically.
(Still, deciding whether the model is applicable in some particular situation remains a common-sense problem, and the probabilities yielded by the model do not capture this uncertainty. If a well-established physical theory, applied by competent people, says that p=0.9999 for some event, common sense tells me that I should treat this event as near-certain—and, if repeated many times, that it will come out the unlikely way very close to one in 10,000 times. On the other hand, if p=0.9999 is produced by some suspicious model that looks like it might be a product of data-dredging rather than real insight about reality, common sense tells me that the event is not at all certain. But there is no way to capture this intuitive uncertainty with a sensible number. The probabilities coming from calibration of repeated judgment are subject to analogous unquantifiable uncertainty.)
There is also a third logical possibility, namely that some people in some situations have precise enough intuitions of certainty that they can quantify them in an accurate way, just like some people can guess what time it is with remarkable precision without looking at the clock. But I see little evidence of this occurring in reality, and even if it does, these are very rare special cases.
I disagree with this, as explained above. Calibration can be done successfully in the special cases I mentioned. However, in cases where it cannot be done, which includes the great majority of the actual beliefs and conclusions made by human brains, devising numerical probabilities makes no sense.
This should be clear from the answer to (3).
[Continued in a separate comment below due to excessive length.]
I’ll point out here that reversed stupidity is not intelligence, and that for every possible error, there is an opposite possible error.
In my view, if someone’s numbers are wrong, that should be dealt with on the object level (e.g. “0.001 is too low”, with arguments for why), rather than retreating to the meta level of “using numbers caused you to err”. The perspective I come from is wanting to avoid the opposite problem, where being vague about one’s beliefs allows one to get away without subjecting them to rigorous scrutiny. (This, too, by the way, is a major hallmark of pseudoscience.)
But I’ll note that even as we continue to argue under opposing rhetorical banners, our disagreement on the practical issue seems to have mostly evaporated; see here for instance. You also do admit in the end that fear of poor calibration is what is underlying your discomfort with numerical probabilities:
As a theoretical matter, I disagree completely with the notion that probabilities are not legitimate or meaningful unless they’re well-calibrated. There is such a thing as a poorly-calibrated Bayesian; it’s a perfectly coherent concept. The Bayesian view of probabilities is that they refer specifically to degrees of belief, and not anything else. We would of course like the beliefs so represented to be as accurate as possible; but they may not be in practice.
If my internal “Bayesian calculator” believes P(X) = 0.001, and X turns out to be true, I’m not made less wrong by having concealed the number, saying “I don’t think X is true” instead. Less embarrassed, perhaps, but not less wrong.
komponisto:
Trouble is, sometimes numbers can be not even wrong, with their very definition lacking logical consistency or any defensible link with reality. It is that category that I am most concerned with, and I believe that it sadly occurs very often in practice, with entire fields of inquiry sometimes degenerating into meaningless games with such numbers. My honest impression is that in our day and age, such numerological fallacies have been responsible for much greater intellectual sins than the opposite fallacy of avoiding scrutiny by excessive vagueness, although the latter phenomenon is not negligible either.
Here we seem to be clashing about terminology. I think that “poor calibration” is too much of a euphemism for the situations I have in mind, namely those where sensible calibration is altogether impossible. I would instead use some stronger expression clarifying that the supposed “calibration” is done without any valid basis, not that the result is poor because some unfortunate circumstance occurred in the course of an otherwise sensible procedure.
As I explained in the above lengthy comment, I simply don’t find numbers that “refer specifically to degrees of belief, and not anything else” a coherent concept. We seem to be working with fundamentally different philosophical premises here.
Can these numerical “degrees of belief” somehow be linked to observable reality according to the criteria I defined in my reply to the points (1)-(2) above? If not, I don’t see how admitting such concepts can be of any use.
But if you do this 10,000 times, and the number of times X turns out to be true is small but nowhere close to 10, you are much more wrong than if you had just been saying “X is highly unlikely” all along.
On the other hand, if we’re observing X as a single event in isolation, I don’t see how this tests your probability estimate in any way. But I suspect we have some additional philosophical differences here.
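For concreteness, here is a toy version of the repeated-trials check described above; the observed count is invented for illustration.

```python
import math

# Someone states P(X) = 0.001 on many occasions; we count how often X happened.
# The observed count below is invented.
stated_p = 0.001
trials = 10_000
observed_true = 40                                        # X came out true 40 times, not ~10

expected_true = stated_p * trials                         # 10.0
std_dev = math.sqrt(trials * stated_p * (1 - stated_p))   # ~3.16 under the stated probability
print(expected_true, observed_true, (observed_true - expected_true) / std_dev)
# 40 true outcomes is roughly 9.5 standard deviations above what the stated
# probability predicts -- far more wrong than the vague "X is highly unlikely".
```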
[Continued from the parent comment.]
I have revised my view about this somewhat thanks to a shrewd comment by xv15. The use of unjustified numerical probabilities can sometimes be a useful figure of speech that will convey an intuitive feeling of certainty to other people more faithfully than verbal expressions. But the important thing to note here is that the numbers in such situations are mere figures of speech, i.e. expressions that exploit various idiosyncrasies of human language and thinking to transmit hard-to-convey intuitive points via non-literal meanings. It is not legitimate to use these numbers for any other purpose.
Otherwise, I agree. Except in the above-discussed cases, subjective probabilities extracted from common-sense reasoning are at best an unnecessary addition to arguments that would be just as valid and rigorous without them. At worst, they can lead to muddled and incorrect thinking based on a false impression of accuracy, rigor, and insight where there is none, and ultimately to numerological pseudoscience.
Also, we still don’t know whether and to what extent various parts of our brains involved in common-sense reasoning approximate Bayesian networks. It may well be that some, or even all of them do, but the problem is that we cannot look at them and calculate the exact probabilities involved, and these are not available to introspection. The fallacy of radical Bayesianism that is often seen on LW is in the assumption that one can somehow work around this problem so as to meaningfully attach an explicit Bayesian procedure and a numerical probability to each judgment one makes.
Note also that even if my case turns out to be significantly weaker under scrutiny, it may still be a valid counterargument to the frequently voiced position that one can, and should, attach a numerical probability to every judgment one makes.
So, that would be a statement of my position; I’m looking forward to any comments.
Suppose you have two studies, each of which measures and gives a probability for the same thing. The first study has a small sample size, and a not terribly rigorous experimental procedure; the second study has a large sample size, and a more thorough procedure. When called on to make a decision, you would use the probability from the larger study. But if the large study hadn’t been conducted, you wouldn’t give up and act like you didn’t have any probability at all; you’d use the one from the small study. You might have to do some extra sanity checks, and your results wouldn’t be as reliable, but they’d still be better than if you didn’t have a probability at all.
A probability assigned by common-sense reasoning is to a probability that came from a small study, as a probability from a small study is to a probability from a large study. The quality of probabilities varies continuously; you get better probabilities by conducting better studies. By saying that a probability based only on common-sense reasoning is meaningless, I think what you’re really trying to do is set a minimum quality level. Since probabilities that’re based on studies and calculation are generally better than probabilities that aren’t, this is a useful heuristic. However, it is only that, a heuristic; probabilities based on common-sense reasoning can sometimes be quite good, and they are often the only information available anywhere (and they are, therefore, the best information). Not all common-sense-based probabilities are equal; if an expert thinks for an hour and then gives a probability, without doing any calculation, then that probability will be much better than if a layman thinks about it for thirty seconds. The best common-sense probabilities are better than the worst statistical-study probabilities; and besides, there usually aren’t any relevant statistical calculations or studies to compare against.
I think what’s confusing you is an intuition that if someone gives a probability, you should be able to take it as-is and start calculating with it. But suppose you had collected five large studies, and someone gave you the results of a sixth. You wouldn’t take that probability as-is, you’d have to combine it with the other five studies somehow. You would only use the new probability as-is if it was significantly better (larger sample, more trustworthy procedure, etc) than the ones you already had, or you didn’t have any before. Now if there are no good studies, and someone gives you a probability that came from their common-sense reasoning, you almost certainly have a comparably good probability already: your own common-sense reasoning. So you have to combine it. So in a sense, those sorts of probabilities are less meaningful—you discard them when they compete with better probabilities, or at least weight them less—but there’s still a nonzero amount of meaning there.
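One crude way, among many, to make “combine it with the probability you already had” concrete is to pool the estimates in log-odds space, with weights standing in for how much each source should count (sample size, rigor, and so on). The sketch below is only an illustration; all of the probabilities and weights are invented.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def pool(estimates):
    # estimates: list of (probability, weight) pairs; a weighted average in
    # log-odds space, one simple pooling rule among many possible ones.
    total_weight = sum(w for _, w in estimates)
    return inv_logit(sum(w * logit(p) for p, w in estimates) / total_weight)

# A large study, a small study, and my own common-sense impression, in that
# order of trustworthiness (all numbers invented).
print(pool([(0.30, 10.0), (0.45, 2.0), (0.60, 1.0)]))  # ~0.34, dominated by the large study
```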
(Aside: I’ve been stuck for a while on an article I’m writing called “What Probability Requires”, dealing with this same topic, and seeing you argue the other side has been extremely helpful. I think I’m unstuck now; thank you for that.)
After thinking about your comment, I think this observation comes close to the core of our disagreement:
Basically, yes. More specifically, the quality level I wish to set is that the numbers must give more useful information than mere verbal expressions of confidence. Otherwise, their use at best simply adds nothing useful, and at worst leads to fallacious reasoning encouraged by a false feeling of accuracy.
Now, there are several possible ways to object to my position:
The first is to note that even if not meaningful mathematically, numbers can serve as communication-facilitating figures of speech. I have conceded this point.
The second way is to insist on an absolute principle that one should always attach numerical probabilities to one’s beliefs. I haven’t seen anything in this thread (or elsewhere) yet that would shake my belief in the fallaciousness of this position, or even provide any plausible-seeming argument in favor of it.
The third way is to agree that sometimes attaching numerical probabilities to common-sense judgments makes no sense, but on the other hand, in some cases common-sense reasoning can produce numerical probabilities that will give more useful information than just fuzzy words. After the discussion with mattnewport and others, I agree that there are such cases, but I still maintain that these are rare exceptions. (In my original statement, I took an overly restrictive notion of “common sense”; I admit that in some cases, thinking that could be reasonably called like that is indeed precise enough to produce meaningful numerical probabilities.)
So, to clarify, which exact position do you take in this regard? Or would your position require a fourth item to summarize fairly?
I agree that there is a non-zero amount of meaning, but the question is whether it exceeds what a simple verbal statement of confidence would convey. If I can’t take a number and start calculating with it, what good is it? (Except for the caveat about possible metaphorical meanings of numbers.)
My response to this ended up being a whole article, which is why it took so long. The short version of my position is, we should attach numbers to beliefs as often as possible, but for instrumental reasons rather than on principle.
As a matter of fact, I can think of one reason—a strong reason in my view—that the consciously felt feeling of certainty is liable to be systematically and significantly exaggerated with respect to the true probability assignment made by the person’s mental black box—the latter being something that we might in principle elicit through experimentation by putting the same subject through variants of a given scenario. (Think revealed probability assignment—similar to revealed preference as understood by the economists.)
The reason is that whole-hearted commitment is usually best whatever one chooses to do. Consider Buridan’s ass, but with the following alterations. Instead of hay and water, to make it more symmetrical suppose the ass has two buckets of water, one on either side about equally distant. Suppose furthermore that his mental black box assigns a 51% probability to the proposition that the bucket on the right side is closer to him than the bucket on the left side.
The question, then, is what should the ass consciously feel about the probability that the bucket on the right is closest? I propose that given that his black box assigns a 51% probability to this, he should go to the bucket on the right. But given that he should go to the bucket on the right, he should go there without delay, without a hesitating step, because hesitation is merely a waste of time. But how can the ass go there without delay if he is consciously feeling that the probability is 51% that the bucket on the right is closest? That feeling will cause within him uncertainty and hesitation and will slow him down. Therefore it is best if the ass consciously is absolutely convinced that the bucket on the right is closest. This conscious feeling of certainty will speed his step and get him to the water quickly.
So it is best for Buridan’s ass that his consciously felt degrees of certainty are great exaggerations of his mental black box’s probability assignments. I think this generalizes. We should consciously feel much more certain of things than we really are, in order to get ourselves moving.
In fact, if Buridan’s ass’s mental black box assigns exactly 50% probability to the right bucket being the closer one, the mental black box should in effect flip a coin and then delude the conscious self to become entirely convinced that the right (or, depending on the coin flip, the left) bucket is the closest and act accordingly.
This can be applied to the reactions of prey to predators. It is so costly for a prey animal to be eaten, and relatively so not very costly for the prey animal merely to waste a bit of its time running, that a prey animal is most likely to survive to reproduce if it is in the habit of completely believing that there is a predator after it far more often than there really is a predator after it. Even if possible-predator-signals in the environment actually signify predators 10% of the time or less, since the prey animal never knows which of those signals is the predator, the prey needs to run for its very life every single time it senses the possible-predator-signal. For it to do this, it must be fully mentally committed to the proposition that there is in fact a predator after it. There is no reason for the prey animal to have any less than full belief that there is a predator after it, each and every time it senses a possible predator.
I don’t agree with this conflation of commitment and belief. I’ve never had to run from a predator, but when I run to catch a train, I am fully committed to catching the train, although I may be uncertain about whether I will succeed. In fact, the less time I have, the faster I must run, but the less likely I am to catch the train. That only affects my decision to run or not. On making the decision, belief and uncertainty are irrelevant, intention and action are everything.
Maybe some people have to make themselves believe in an outcome they know to be uncertain, in order to achieve it, but that is just a psychological exercise, not a necessary part of action.
The question is not whether there are some examples of commitment which do not involve belief. The question is whether there are (some, many) examples where really, absolutely full commitment does involve belief. I think there are many.
Consider what commitment is. If someone says, “you don’t seem fully committed to this”, what sort of thing might have prompted him to say this? It’s something like, he thinks you aren’t doing everything you could possibly do to help this along. He thinks you are holding back.
You might reply to this criticism, “I am not holding anything back. There is literally nothing more that I can do to further the probability of success, so there is no point in doing more—it would be an empty and possibly counterproductive gesture rather than being an action that truly furthers the chance of success.”
So the important question is, what can a creature do to further the probability of success? Let’s look at you running to catch the train. You claim that believing that you will succeed would not further the success of your effort. Well, of course not! I could have told you that! If you believe that you will succeed, you can become complacent, which runs the risk of slowing you down.
But if you believe that there is something chasing you, that is likely to speed you up.
Your argument is essentially, “my full commitment didn’t involve belief X, therefore you’re wrong”. But belief X is a belief that would have slowed you down. It would have reduced, not furthered, your chance of success. So of course your full commitment didn’t involve belief X.
My point is that it is often the case that a certain consciously felt belief would increase a person’s chances of success, given their chosen course of action. And in light of what commitment is—it is commitment of one’s self and one’s resources to furthering the probability of success—then if a belief would further a chance of success, then full, really full commitment will include that belief.
So I am not conflating conscious belief with commitment. I am saying that conscious belief can be, and often is, involved in the furthering of success, and therefore can be and often is a part of really full commitment. That is no more conflating belief with commitment than saying that a strong fabric makes a good coat conflates fabric with coats.
You’re right that my analogy was inaccurate: what corresponds in the train-catching scenario to believing there is a predator is my belief that I need to catch this train.
A stronger belief may produce stronger commitment, but strong commitment does not require strong belief. The animal either flees or does not, because a half-hearted sprint will have no effect on the outcome whether a predator is there or not. Similarly, there’s no point making a half-hearted jog for a train, regardless of how much or little one values catching it.
Belief and commitment to act on the belief are two different parts of the process.
Of course, a lot of the “success” literature urges people to have faith in themselves, to believe in their mission, to cast all doubt aside, etc., and if a tool works for someone I’ve no urge to tell them it shouldn’t. But, personally, I take Yoda’s attitude: “Do, or do not.”
Yoda tutors Luke in Jedi philosophy and practice, which it will take Luke a while to learn. In the meantime, however, Luke is merely an unpolished human. And I am not here recommending a particular philosophy and practice of thought and behavior, but making a prediction about how unpolished humans (and animals) are likely to act. My point is not to recommend that Buridan’s ass should have an exaggerated confidence that the right bucket is closer, but to observe that we can expect him to have an exaggerated confidence, because, for reasons I described, exaggerated confidence is likely to have been selected for because it is likely to have improved the chances of survival of asses who did not have the benefit of Yoda’s instruction.
So I don’t recommend, rather I expect that humans will commonly have conscious feelings of confidence which are exaggerated, and which do not truly reflect the output of the human’s mental black box, his mental machinery to which he does not have access.
Let me explain by the way what I mean here, because I’m saying that the black box can output a 51% probability for Proposition P while at the same time causing the person to be consciously absolutely convinced of the truth of P. This may be confusing, because I seem to be saying that the black box outputs two probabilities, a 51% probability for purposes of decisionmaking and a 100% probability for conscious consumption. So let me explain with an example what I mean.
Suppose you want to test Buridan’s ass to see what probability he assigns to the proposition that the right bucket is closer. What you can do is take the scenario and alter as follows: introduce a mechanism which, with 4% probability, will move the right bucket further than the left bucket before Buridan’s ass gets to it.
Now, if Buridan’s ass assigns a 100% probability that the right bucket is (currently) closer than the left bucket, then taking into account the introduced mechanism, this yields a 96% probability that, by the time the ass gets to it, the right bucket will still be closer to the ass’s starting position. But if Buridan’s ass assigns a 51% probability that the right bucket is (currently) closer than the left bucket, then taking into account the mechanism, this yields approximately a 49% probability (assuming I did the numbers right) that by the time the ass gets to it, the right bucket will be closer.
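For readers checking the arithmetic (and assuming, as the scenario seems to intend, that the mechanism fires independently of which bucket is currently closer): a triggered mechanism leaves the right bucket farther away no matter where it started, so P(right bucket closer at arrival) = P(right bucket closer now) * P(mechanism does not fire). With full conviction that is 1.00 * 0.96 = 0.96; with the 51% assignment it is 0.51 * 0.96 ≈ 0.49, matching the figures above.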
I am, of course, assuming that the ass is smart enough to understand and incorporate the mechanism into his calculations. Animals have eyes and ears and brains for a reason, so I don’t think it’s a stretch to suppose that there is some way to implement this scenario in a way that an ass really could understand.
So here’s how the test works. You observe that the ass goes to the bucket on the right. You are not sure whether the ass has assigned a 51% probability or a 100% probability to the right bucket being nearer. So you redo the experiment with the added mechanism. If the ass now (with the introduced mechanism) goes to the bucket on the left, then you can infer that the ass now believes that the probability that the right bucket will be closer by the time he reaches it is less than 50%. But it only changed by a few percentage points as a result of the added mechanism. Therefore he must have assigned only slightly more than 50% probability to it to begin with.
And in this sort of way, you can elicit the ass’s probability assignments.
The ass’s conscious state of mind, however, is something completely separate from this. If we grant the ass the gift of speech, the ass may well say, each time, “there’s not a shred of doubt in my mind that the right bucket is closer”, or “I am entirely confident that the left bucket is closer”.
My point being that we may well be like the ass, and introspective examination of our own conscious state of mind may fail to reveal the actual probabilities that our mental black boxes have assigned to events. It may instead reveal only overconfident delusions that the black box has instilled in the conscious mind for the purpose of encouraging quick action.
Thanks for the lengthy answer. Still, why is it impossible to calibrate people in general, by looking at how often they get the answer right and then using them as a device for measuring probabilities? If a person is right on approximately 80% of the issues on which he says he’s “sure”, then why not translate his next “sure” into an 80% probability? That doesn’t seem arbitrary to me. There may be inconsistency between measurements using different people, but strictly speaking, thermometers and clocks also sometimes disagree.
I do discuss this exact point in the above lengthy comment, and I allow for this possibility. Here is the relevant part:
Now clearly, the critical part is to ensure that the future judgments are based on the same parts of the person’s brain and that the relevant features of these parts, as well as the problem being solved, remain unchanged. In practice, these requirements can be satisfied by people who have reached the peak of ability achievable by learning from experience in solving some problem that repeatedly occurs in nearly identical form. Still, even in the best case, we’re talking about a very limited number of questions and people here.
I know you have limited it to repeated judgments about essentially the same question. I was rather asking why, and I am still not sure whether I interpret it correctly. Is it that the judgments themselves are possibly produced by different parts of the brain, or that the person’s self-evaluations of certainty are produced by different parts of the brain, or both? And if so, so what?
Imagine a test is done on a particular person. During five consecutive years he is asked a lot of questions (of all different types), and he has to give an answer and a subjective feeling of certainty. After that, we see that the answers which he has labeled as “almost certain” were right in 83%, 78%, 81%, 84% and 85% of cases in the five years. Let’s even say that the experimenters were careful enough to divide the questions into different topics, and to establish that his “almost certain” answers about medicine were right 94% of the time on average and his “almost certain” answers about politics were right 56% of the time on average. All other topics were near the overall average.
Do you 1) maintain that such stable results are very unlikely to happen, or 2) hold that even if most people can be calibrated in such a way, this still doesn’t justify using them for measuring probabilities?
prase:
We don’t really know, but it could certainly be both, and also it may well be that the same parts of the brain are not equally reliable for all questions they are capable of processing. Therefore, while simple inductive reasoning tells us that consistent accuracy on the same problem can be extrapolated, there is no ground to generalize to other questions, since they may involve different parts of the brain, or the same part functioning in different modes that don’t have the same accuracy.
Unless, of course, we cover all such various parts and modes and obtain some sort of weighted average over them, which I suppose is the point of your thought experiment, of which more below.
If the set of questions remains representative—in the sense of querying the same brain processes with the same frequency—the results could turn out to be fairly stable. This could conceivably be achieved by large and wide-ranging sets of questions. (I wonder if someone has actually done such experiments?)
However, the result could be replicated only if the same person is again asked similar large sets of questions that are representative with regards to the frequencies with which they query different brain processes. Relative to that reference class, it clearly makes sense to attach probabilities to answers, so, yes, here we would have another counterexample for my original claim, for another peculiar meaning of probabilities.
The trouble is that these probabilities would be useless for any purpose that doesn’t involve another similar representative set of questions. In particular, sets of questions about some particular topic that is not representative would presumably not replicate them, and thus they would be a very bad guide for betting that is limited to some particular topic (as it nearly always is). Thus, this seems like an interesting theoretical exercise, but not a way to obtain practically useful numbers.
(I should add that I never thought about this scenario before, so my reasoning here might be wrong.)
If there are any experimental psychologists reading this, maybe they can organise the experiment. I am curious whether people can indeed be calibrated on general questions.
I tell you I believe X with 54% certainty. Who knows, that number could have been generated in a completely bogus way. But however I got here, this is where I am. There are bets about X that I will and won’t take, and guess what, that’s my cutoff probability right there. And by the way, now I have communicated to you where I am, in a way that does not further compound the error.
Meaningless is a very strong word.
In the face of such uncertainty, it could feel natural to take shelter in the idea of “inherent vagueness”...but this is reality, and we place our bets with real dollars and cents, and all the uncertainty in the world collapses to a number in the face of the expectation operator.
So why stop there? If you can justify 54%, then why not go further and calculate a dozen or two more significant digits, and stand behind them all with unshaken resolve?
You can, of course. For most situations, the effort is not worth the trade-off. But making a distinction between 1%, 25%, 50%, 75%, and 99% often is.
You can (at least formally) put error bars on the quantities that go into a Bayesian calculation. The problem, of course, is that error bars are short-hand for a distribution of possible values, and it’s not obvious what a distribution of probabilities means or should mean. Everything operational about probability functions is fully captured by their full set of expectation values, so this is no different than just immediately taking the mean, right?
Well, no. The uncertainties are a higher level model that not only makes predictions, but also calibrates how much these predictions are likely to move given new data.
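One standard way to make this concrete, supplied here purely as an illustration rather than as anything asserted above, is to represent the probability itself by a Beta distribution: two states of knowledge can share the same mean, and so make the same immediate predictions, while responding very differently to new data.

```python
# Two Beta(a, b) "states of knowledge" with the same expected probability (0.5)
# but very different error bars, updated on the same invented data.
def beta_mean(a, b):
    return a / (a + b)

def update(a, b, successes, failures):
    # Conjugate Beta-Bernoulli update.
    return a + successes, b + failures

priors = {"vague": (1, 1), "sharp": (100, 100)}  # both have mean 0.5
data = (8, 2)                                    # observe 8 successes, 2 failures

for name, (a, b) in priors.items():
    a2, b2 = update(a, b, *data)
    print(name, beta_mean(a, b), "->", round(beta_mean(a2, b2), 3))
# vague: 0.5 -> 0.75   (the estimate moves a lot)
# sharp: 0.5 -> 0.514  (the estimate barely moves)
```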
It seems to me that this is somewhat related to the problem of logical uncertainty.
Again, meaningless is a very strong word, and it does not make your case easy. You seem to be suggesting that NO number, however imprecise, has any place here, and so you do not get to refute me by saying that I have to embrace arbitrary precision.
In any case, if you offer me some bets with more significant digits in the odds, my choices will reveal the cutoff to more significant digits. Wherever it may be, there will still be some bets I will and won’t take, and the number reflects that, which means it carries very real meaning.
Now, maybe I will hold the line at 54% exactly, not feeling any gain to thinking harder about the cutoff (as it gets harder AND less important to nail down further digits). Heck, maybe on some other issue I only care to go out to the nearest 10%. But so what? There are plenty of cases where I know my common sense belief probability to within 10%. That suggests such an estimate is not meaningless.
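To spell out the operational content of this point with a stylized example (assuming risk neutrality and stakes small enough that utility is roughly linear in money): if I will pay up to $0.54, but no more, for a ticket that pays $1 if X occurs and nothing otherwise, then the break-even condition p * $1 = $0.54 pins my revealed probability for X at 0.54. Offering further tickets priced at $0.535 and $0.545 would narrow the cutoff by another digit, exactly as described above.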
xv15:
To be precise, I wrote “meaningless, except perhaps as a vague figure of speech.” I agree that the claim would be too strong without that qualification, but I do believe that “vague figure of speech” is a fair summary of the meaningfulness that is to be found there. (Note also that the claim specifically applies to “common-sense conclusions and beliefs,” not things where there is a valid basis for employing mathematical models that yield numerical probabilities.)
You seem to be saying that since you perceive this number as meaningful, you will be willing to act on it, and this by itself renders it meaningful, since it serves as guide for your actions. If we define “meaningful” to cover this case, then I agree with you, and this qualification should be added to my above statement. But the sense in which I used the term originally doesn’t cover this case.
Fair. Let me be precise too. I read your original statement as saying that numbers will never add meaning beyond what a vague figure of speech would, i.e. if you say “I strongly believe this” you cannot make your position more clear by attaching a number. That I disagree with. To me it seems clear that:
i) “Common-sense conclusions and beliefs” are held with varying levels of precision.
ii) Often even these beliefs are held with a level of precision that can best be described with a number. (Best = most succinctly, least misinterpretable, etc. ... indeed it seems to me that sometimes “best” could be replaced with “only.” You will never get people to understand 60% by saying “I reasonably strongly believe”... and yet your belief may be demonstrably closer to 60 than to 50 or 70.)
I don’t think your statement is defensible from a normal definition of “common sense conclusions,” but you may have internally defined it in such a way as to make your statement true, with a (I think) relatively narrow sense of “meaningfulness” also in mind. For instance if you ignore the role of numbers in transmission of belief from one party to the next, you are a big step closer to being correct.
xv15:
You have a very good point here. For example, a dialog like this could result in a real exchange of useful information:
A: “I think this project will probably fail.”
B: “So, you mean you’re, like, 90% sure it will fail?”
A: “Um… not really, more like 80%.”
I can imagine a genuine meeting of minds here, where B now has a very good idea of how confident A feels about his prediction. The numbers are still used as mere figures of speech, but “vague” is not a correct way to describe them, since the information has been transmitted in a more precise way than if A had just used verbal qualifiers.
So, I agree that “vague” should probably be removed from my original claim.
On point #2, I agree with you. On point #1, I had the same reaction as xv15. Your example conversation is exactly how I would defend the use of numerical probabilities in conversation. I think you may have confused people with the phrase “vague figure of speech,” which was itself vague.
Vague relative to what? “No idea / kinda sure / pretty sure / very sure?”, the ways that people generally communicate about probability, are much worse. You can throw in other terms like “I suspect” and “absolutely certain” and “very very sure”, but it’s not even clear how these expressions of belief match up with others. In common speech, we really only have about 3-5 degrees of probability. That’s just not enough gradations.
In contrast, when expressing a percentage probability, people only tend to use multiples of 10, certain multiples of 5, 0.01%, 1%, 2%, 98%, 99% and 99.99%. If people use figures like 87%, or any decimal places other than the ones previously mentioned, it’s usually because they are deliberately being ridiculous. (And it’s no coincidence that your example uses multiples of 10.)
I agree with you that feelings of uncertainty are fuzzy, but they aren’t so fuzzy that we can get by with merely 3-5 gradations in all sorts of conversations. On some subjects, our communication becomes more precise when we have 10-20 gradations. Yet there are diminishing returns on more degrees of communicable certainty (due to reasons you correctly describe), so going any higher resolution than 10-20 degrees isn’t useful for anything except jokes.
Yes. Gaining the 10-20 gradations that numbers allow when they are typically used does make conversations relatively more precise than just by tacking on “very very” to your statement of certainty.
It’s similar to the infamous 1-10 rating system for people’s attractiveness. Despite various reasons that rating people with numbers is distasteful, this ranking system persists because, in my view, people find it useful for communicating subjective assessments of attractiveness. Ugly-cute-hot is a 3-point scale. You could add in “gorgeous,” “beautiful,” or modifiers like “smoking hot,” but it’s unclear how these terms rank against each other (and they may express different types of attraction, rather than different degrees). Again, it’s hard to get more than 3-5 degrees using plain English. The 1-10 scale (with half-points, and 9.9) gives you about 20 gradations (though 1-3, and any half-point values below 5 are rarely used).
I think we have a generalized phenomenon where people resort to using numbers to describe their subjective feelings when common language doesn’t grant high enough resolution. 3-5 is good enough for some feelings (3 gives you negative, neutral, and positive for instance), but for some feelings we need more. Somewhere around 20 is the upper-bound of useful gradations.
I mostly agree with this assessment. However, the key point is that such uses of numbers should be seen as metaphorical. The literal meaning of a metaphor is typically nonsensical, but it works by somehow hacking the human understanding of language to successfully convey a point with greater precision than the most precise literal statement would allow, at least in as many words. (There are other functions of metaphors too, of course, but this one is relevant here.) And just like it is fallacious to understand a metaphor literally, it is similarly fallacious to interpret these numerical metaphors as useful for mathematical purposes. When it comes to subjective probabilities, however, I often see what looks like confusion on this point.
It is wrong to directly use a subjective probability that you got from someone else for mathematical purposes, for reasons I expand on in my comment here. But I don’t think that makes them metaphorical, unless you’re using a definition of metaphor very different from the one I am. And you can use a subjective probability which you generated yourself, or one combined with your own subjective probability, in calculations. Doing so just comes with the same caveats as using a probability from a study whose sample was too small, or which had some other bad but not entirely fatal flaw.
I will write a reply to that earlier comment of yours a bit later today when I’ll have more time. (I didn’t forget about it, it’s just that I usually answer lengthy comments that deserve a greater time investment later than those where I can fire off replies rapidly during short breaks.)
But in addition to the theme of that comment, I think you’re missing my point about the possible metaphorical quality of numbers. Human verbal expressions have their literal information content, but one can often exploit the idiosyncrasies of the human language interpretation circuits to effectively convey information altogether different from the literal meaning of one’s words. This gives rise to various metaphors and other figures of speech, which humans use in their communication frequently and effectively. (The process is more complex than this simple picture, since frequently used metaphors can eventually come to be understood as literal expressions of their common metaphorical meaning, and this process is gradual. There are also other important considerations about metaphors, but this simple observation is enough to support my point.)
Now, I propose that certain practical uses of numbers in communication should be seen that way too. A literal meaning of a number is that something can ultimately be counted, measured, or calculated to arrive at that number. A metaphorical use of a number, however, doesn’t convey any such meaning, but merely expects to elicit similar intuitive impressions, which would be difficult or even impossible to communicate precisely using ordinary words. And just like a verbal metaphor is nonsensical except for the non-literal intuitive point it conveys, and its literal meaning should be discarded, at least some practical uses of numbers in human conversations serve only to communicate intuitive points, and the actual values are otherwise nonsensical and should not be used for any other purposes—and even if they perhaps are, their metaphorical value should be clearly seen apart from their literal mathematical value.
Therefore, regardless of our disagreement about subjective probabilities (of which more in my planned reply), this is a separate important point I wanted to make.
okay. I still suspect I disagree with whatever you mean by mere “figures of speech,” but this rational truthseeker does not have infinite time or energy.
in any case, thank you for a productive and civil exchange.
Or, you could slide up your arbitrary and fallacious slippery slope and end up with Shultz.
Even if you believe that my position is fallacious, I am surely not the one to be accused of arbitrariness here. Arbitrariness is exactly what I object to, in the sense of insisting on the validity of numbers that lack both a logically correct justification and the clear error bars that would follow from such a justification. And I’m asking the above question in full seriousness: a Bayesian probability calculation will give you as many significant digits as you want, so if you believe that it makes sense to extract a Bayesian probability with two significant digits from your common-sense reasoning, why not more than that?
In any case, I have explained my position at length, and it would be nice if you addressed the substance of what I wrote instead of trying to come up with witty one-liner jabs. For those who want the latter, there are other places on the web full of people whose talent for such things is considerably greater than yours.
I specifically object to your implied argument in the grandparent. I will continue to reject comments that make that mistake regardless of how many times you insult me.
Look, in this thread, you have clearly been making jabs for rhetorical effect, without any attempt to argue in a clear and constructive manner. I am calling you out on that, and if you perceive that as insulting, then so be it.
Everything I wrote here has been perfectly honest and upfront, and written with the goal of eliciting rational counter-arguments from which I might perhaps change my opinion. I have neither the time nor the inclination for the sort of one-upmanship and showing off that you seem to be after, and even if I were, I would pursue it in some more suitable venue. (Where, among other things, one would indeed expect to see the sort of performance you’re striving for done in a much more skilled and entertaining way.)
Your map is not the territory. If you look a little closer you may find that my points are directed at the topic, and not your ego. In particular, take a second glance at this comment. The very example of betting illustrates the core problem with your position.
The insult would be that you are telling me I’m bad at entertaining one-upmanship. I happen to believe I would be quite good at making such performances were I of a mind and in a context where it suited my goals (dealing with AMOGs, for example).
When dealing with intelligent agents, if you notice that what they are doing does not seem to be effective at achieving their goals it is time to notice your confusion. It is most likely that your model of their motives is inaccurate. Mind reading is hard.
Shultz does know nuthink. Slippery slopes do (arbitrarily) slide in both directions (toward either Shultz or Omega, in this case). Most importantly, if you cannot assign numbers to confidence levels, you will lose money when you try to bet.
Upvoted, because I think you’re only probably right. And you not only stole my thunder, you made it more thunderous :(
Downvote if you agree with something, upvote if you disagree.
EDIT: I missed the word “only.” I just read “I think you’re probably right.” My mistake.
Upvote for disagreements of overconfidence OR underconfidence.
Same here. A “pretty sure” confidence level would probably have done it for me.
Um, so when Nate Silver tells us he’s calculated odds of 2 in 3 that Republicans will control the House after the election, should this number be discarded as noise because it’s a common-sense belief that the Republicans will gain that many seats?
Boy did I hit a hornets’ nest with this one!
No, of course I didn’t mean anything like that. Here is how I see this situation. Silver has a model, which is ultimately a piece of mathematics telling us that some p=0.667, and for reasons of common sense, Silver believes (assuming he’s being upfront with all this) that this model closely approximates reality in such a way that p can be interpreted, with reasonable accuracy, as the probability of Republicans winning a House majority this November.
Now, when you ask someone which party is likely to win this election, this person’s brain will activate some algorithm that will produce an answer along with some rough level of confidence. Someone completely ignorant about politics might answer that he has no idea, and cannot say anything with any certainty. Other people will predict different results with varying (informally expressed) confidence. Silver himself, or someone else who agrees with his model, might reply that the best answer is whatever the model says (i.e. Republicans win with p=0.667), since it is completely superior to the opaque common-sense algorithms used by the brains of non-mathy political analysts. Others will have greater or lesser confidence in the accuracy of the model, and might take its results into account, with varying weight, alongside other common-sense considerations.
Ultimately, the status of this number depends on the relation between Silver’s model and reality. If you believe that the model is a vast improvement over any informal common-sense considerations in predicting election results, just like Newton’s theory is a vast improvement over any common-sense considerations in predicting the motions of planets, then we’re not talking about a common-sense conclusion any more. On the other hand, if you believe that the model is completely out of touch with reality, then you would discard its result as noise. Finally, if you believe that it’s somewhat accurate, but still not reliably superior to common sense, you might revise its conclusion using common sense.
What you believe about Silver’s model, however, is still ultimately a matter of common-sense judgment, and unless you think that you have a model so good that it should be used in a shut-up-and-calculate way, your ultimate best prediction of the election results won’t come with any numerical probabilities, merely a vague feeling of how confident you are.
Want to make a bet on that?
In your linked comment you write:
Do you not think that this feeling response can be trained through calibration exercises and by making and checking predictions? I have not done this myself yet, but this is how I’ve assumed others became able to assign numerical probabilities with confidence.
Luke_Grecki:
Well, sometimes frequentism can come to the rescue, in a sense. If you are repeatedly faced with an identical situation where it’s necessary to make some common-sense judgment, like e.g. on an assembly line, you can look at your past performance to predict how often you’ll be correct in the future. (This assumes you’re not getting better or worse with time, of course.) However, what you’re doing in that case is treating a part of your own brain as a black box whose behavior you’re testing empirically to extrapolate a frequentist rule—you are not performing the judgment itself as a rigorous Bayesian procedure that would give you the probability for the conclusion.
That said, it’s clear that smarter and more knowledgeable people think with greater accuracy and subtlety, so that their intuitive feelings of (un)certainty are also subtler and more accurate. But there is still no magic step that will translate these feelings output by black-box circuits in their brains into numbers that could lay claim to mathematical rigor and accuracy.
No, but do you think it is meaningless to think of the messy brain procedure (that produces these intuitive feelings) as approximating this rigorous Bayesian procedure? This could probably be quantified using various tests. I don’t dispute that such assignments can’t lay claim to mathematical rigor, but I’m not sure that means that any human assignment of numerical probabilities is meaningless.
Yes, with good enough calibration, it does make sense. If you have an assembly line worker whose job is to notice and remove defective items, and he’s been doing it with a steady (say) 99.7% accuracy for a long time, it makes sense to assign p=0.997 to each single judgment he makes about an individual item, and this number can be of practical value in managing production. However, this doesn’t mean that you could improve the worker’s performance by teaching him about Bayesianism; his brain remains a black box. The important point is that the same typically holds for highbrow intellectual tasks too.
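As a minimal sketch of that calibration, assuming the worker’s past judgments are simply recorded as correct or incorrect (the data here are made up to match the 99.7% figure): the black box’s historical hit rate, together with a rough error bar, is all that gets attached to each new judgment.

```python
from math import sqrt

# Hypothetical record of past judgments (True = item judged correctly),
# invented to match the 99.7% figure in the example above.
past_judgments = [True] * 997 + [False] * 3

n = len(past_judgments)
k = sum(past_judgments)

# Treat the worker's brain as a black box and use its empirical hit rate.
p_correct = k / n

# A rough error bar (normal approximation), so the number carries
# the uncertainty the surrounding discussion insists on.
se = sqrt(p_correct * (1 - p_correct) / n)

print(f"p = {p_correct:.3f} +/- {1.96 * se:.3f}")
# Each new judgment by this worker gets assigned p ~ 0.997, with no claim
# about how the judgment was produced inside the black box.
```

Nothing in this sketch requires, or would benefit from, teaching the worker Bayesianism; the calculation only looks at the box from the outside.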
Moreover, for the great majority of interesting questions about the world, we don’t have the luxury of a large reference class of trials on which to calibrate. Take for example the recent discussion about the AD-36 virus controversy. If you look at the literature, you’ll presumably form an opinion about this question with a higher or lower certainty, depending on how much confidence you have in your own ability to judge about such matters. But how to calibrate this judgment in order to arrive at a probability estimate? There is no way.
To make sure I understand your point, I will try to restate it.
We have very limited access to our mental processes. In fact, in some cases our access to our mental processes is indirect—that is, we only discover what we believe once we have observed how we act. We observe our own act, and from this we can infer that we must have believed such-and-such. We can attempt to reconstruct our own process of thinking, but that process is essentially a black box whose internals we are modeling, and the outputs of the black box at any given time are meager. We are of course always using the black box, which gives us a lot of data to go on in an absolute sense, but since the topic is constantly changing and since our beliefs are also in flux, the relevance of most of that data to the correct understanding of a particular act of thinking is unclear. In modeling our own mental processes we are rationalizing, with all the potential pitfalls associated with rationalization.
Nevertheless, this does not stop us from using the familiar gambling method for eliciting probability assessments, understood as willingness to wager. The gambling method, even if it is artificial, is at least reasonable, because every behavior we exhibit involves a kind of wager. However the black box operates, it will produce a certain response for each offered betting odds, from which its probability assignments can be derived. Of course this won’t work if the black box produces inconsistent (i.e. Dutch bookable) responses to the betting odds, but whether and to what degree it does or not is an empirical question. As a matter of fact, you’ve been talking about precision, and I think here’s how we can define the precision of your probability assignment. I’m sure that the black box’s responses to betting odds will be somewhat inconsistent. We can measure how inconsistent they are. There will be a certain gap of a certain size which can be Dutch booked—the bigger the gap the quicker you can be milked. And this will be the measure of the precision of your probability assignment.
But suppose that a person always in effect bets for something given certain odds or above, in whatever manner the bet is put to him, and always bets against if given odds anywhere below, and suppose the cutoff between his betting for and against is some very precise number such as pi to twelve digits. Then that seems to say that the odds his black box assigns is precisely those odds.
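For concreteness, here is a rough sketch of the elicitation procedure being described, with a simulated and deliberately inconsistent “black box” standing in for a person’s betting behavior (the function names and numbers are mine, purely for illustration): scan over offered prices, find the band where the responses are mixed, and take its midpoint as the implied probability and its width as the precision.

```python
import random

random.seed(0)

def black_box_accepts(price, true_p=0.73, noise=0.03):
    """Stand-in for a person's betting behavior: accept a bet costing `price`
    (paying 1 if the event happens) when it looks favorable, with some
    inconsistency thrown in. All numbers are arbitrary."""
    return price < true_p + random.uniform(-noise, noise)

def elicit(black_box, trials_per_price=50):
    """Scan offered prices from 1% to 99%. The band where responses are mixed
    is the Dutch-bookable gap; its midpoint is the implied probability and
    its width is the precision of that assignment."""
    highest_always_yes, lowest_always_no = 0.0, 1.0
    for i in range(1, 100):
        price = i / 100
        answers = [black_box(price) for _ in range(trials_per_price)]
        if all(answers):
            highest_always_yes = max(highest_always_yes, price)
        elif not any(answers):
            lowest_always_no = min(lowest_always_no, price)
    implied_p = (highest_always_yes + lowest_always_no) / 2
    precision = lowest_always_no - highest_always_yes
    return implied_p, precision

p, width = elicit(black_box_accepts)
print(f"implied probability ~ {p:.2f}, inconsistent over a band of width {width:.2f}")
```

In the limiting case described just above, where the cutoff is perfectly sharp, the band’s width shrinks to zero and the implied odds are exactly the cutoff.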
You write:
But I don’t think we should be looking at introspectable “output.” The purpose of the brain isn’t to produce rough and vague feelings which we can then appreciate through inner contemplation. The purpose of the brain is to produce action, to decide on a course of action and then move the muscles accordingly. Our introspective power is limited at best. Over a lifetime of knowing ourselves we can probably get pretty good at knowing our own beliefs, but I don’t think we should treat introspection as the gold standard for measuring a person’s belief. Like preference, belief is revealed in action. And action is what the gambling method of eliciting probability assignments looks at. While the brain produces only rough and vague feelings of certainty for the purposes of one’s own navel-gazing, at the same time it produces very definite behavior, very definite decisions, from which can be derived, at least in principle, probability assignments—and also, as I mention above, the precision of those probability assignments.
I grant, by implication, that one’s own probability assignments are not necessarily introspectable. That goes without saying.
You write:
Your first described way takes the vague feeling to be the output of the black box. But the purpose of the black box is action, decision, and that is the output we should be looking at, and it is the output that the gambling method looks at. And that is a third way of arriving at a numerical probability which you didn’t cover.
Aside from some quibbles that aren’t really worth getting into, I have no significant disagreement with your comments. There is nothing wrong with looking at people’s acts in practice and observing that they behave as if they operated with subjective probability estimates in some range. However, your statement that “one’s own probability assignments are not necessarily introspectable” basically restates my main point, which was exactly about the meaninglessness of analyzing one’s own common-sense judgments to arrive at a numerical probability estimate, which many people here, in contrast, consider to be the right way to increase the accuracy of one’s thinking. (Though I admit that it should probably be worded more precisely to make sure it’s interpreted that way.)
As it happens, early on I voted your initial comment down (following the topsy-turvy rules of the main post) because based on my first impression I thought I agreed with you. Reconsideration of your comment in light of the ensuing discussion brought to my mind this seeming objection. But you have disarmed the objection, so I am back to agreement.