Ways of describing the “trustworthiness” of probabilities
While doing research for a post on the idea of a distinction between “risk” and “(Knightian) uncertainty”, I came across a surprisingly large number of different ways of describing the idea that some probabilities may be more or less “reliable”, “trustworthy”, “well-grounded”, etc. than others, or things along those lines. (Note that I’m referring to the idea of different degrees of trustworthiness-or-whatever, rather than two or more fundamentally different types of probability that vary in trustworthiness-or-whatever.)
I realised that it might be valuable to write a post collecting all of these terms/concepts/framings together, analysing the extent to which some may be identical to others, highlighting ways in which they may differ, suggesting ways or contexts in which some of the concepts may be superior to others, etc.[1] But there are already too many things I’m working on writing at the moment, so this is a low-effort version of that idea—this is basically just a collection of the concepts, relevant quotes, and links where readers can find more.
Comments on this post will inform whether I take the time to write something more substantial/useful on this topic later (and, if so, precisely what and how).
Note that this post does not explicitly cover the “risk vs uncertainty” framing itself, as I’m already writing a separate, more thorough post on that.
Epistemic credentials
Dominic Roser speaks of how “high” or “low” the epistemic credentials of our probabilities are. He writes:
The expression ‘‘epistemic credentials of probabilities’’ is a shorthand for two things: First, it refers to the credentials of the epistemic access to the probabilities: Are our beliefs about the probabilities well-grounded? Second—and this applies only to the case of subjective probabilities—it refers to the credentials of the probabilities themselves: Are our subjective probabilities—i.e. our degrees of belief—well-grounded?
He further explains what he means by this in a passage that also alludes to many other ways of describing or framing an idea along the lines of the trustworthiness of given probabilities:
What does it mean for probabilities to have (sufficiently) high epistemic credentials? It could for example mean that we can calculate or reliably estimate the probabilities (Keynes 1937, p. 214; Gardiner 2010, p. 7; Shue 2010, p. 148) rather than just guesstimate them; it could mean that our epistemic access allows for unique, numerical or precise probabilities (Kelsey and Quiggin 1992, p. 135; Friedman 1976, p. 282; Kuhn 1997, p. 56) rather than for qualitative and vague characterizations of probabilities or for ranges of probabilities; or it could mean that our epistemic access allows for knowledge of probabilities, in particular for knowledge that is certain, or which goes beyond the threshold of being extremely insecure, or which is not only based on a partial theory that is only valid ceteris paribus (Hansson 2009, p. 426; Rawls 1999, p. 134; Elster 1983, p. 202).
These examples from the literature provide different ways of spelling out the idea that our epistemic situation with regard to the probabilities must be of sufficient quality before we can properly claim to have probabilities. I will not focus on any single one of those ways. I am only concerned with the fact that they are all distinct from the idea that the mere existence of probabilities and mere epistemic access, however minimal, is sufficient. This second and narrower way of understanding ‘‘having probabilities’’ seems quite common for distinguishing risk from uncertainty. For example, in his discussion of uncertainty, Gardiner (2006, p. 34), based on Rawls (1999, p. 134), speaks of lacking, or having reason to sharply discount, information about probabilities, Peterson (2009, p. 6) speaks of it being virtually impossible to assign probabilities, and Bognar (2011, p. 331) says that precautionary measures are warranted whenever the conditions that Rawls described are approximated. This indicates that in order to distinguish risk from uncertainty, these authors do not examine whether we have probabilities at all, but rather whether we have high-credentials probabilities rather than low-credentials probabilities.
Note also that some believe that scientific progress can turn contexts of uncertainty into contexts of risk. For example, in the third assessment report the IPCC gave a temperature range but it did not indicate the probability of staying within this range. In the fourth assessment report, probabilities were added to the range. If one believes that scientific progress can move us from uncertainty to risk, this indicates as well that one’s risk-uncertainty distinction is not about the sheer availability of probabilities, i.e. having probabilities simpliciter. Given the gradual progress of science, it would be surprising if, after some time, probabilities suddenly became available at all. It seems more plausible that probabilities which were available all along changed from having hardly any credentials (in which case we might call them hunches) to gradually having more credentials. And when they cross some threshold of credentials, then—so I interpret parts of the literature—we switch from uncertainty to risk and we can properly claim to have probabilities. (line breaks added)
Resilience (of credences)
Amanda Askell discusses the idea that we can have “more” or “less” resilient credences[2] in this talk and this book chapter.
From the talk:
if I thought there was a 50% chance that I would get $100, there’s actually a difference between a low resilience 50% and a high resilience 50%.
I’m going to argue that, if your credences are low resilience, then the value of information in this domain is generally higher than it would be in a domain where your credences are high resilience. And, I’m going to argue that this means that actually in many cases, we should prefer interventions with less evidential support, all else being equal.
[...] One kind of simple formulation of resilience [...] is that credo-resilience is how stable you expect your credences to be in response to new evidence. If my credences are high resilience, then there’s more stability. I don’t expect them to vary that much as new evidence comes in, even if the evidence is good and pertinent to the question. If they’re low resilience, then they have low stability. I expect them to change a little in response to new evidence. That’s true in the case of the untested coin, where I just have no data about how good it is, so the resilience of my credence of 50% is fairly low.
It’s worth noting that resilience levels can reflect either the set of evidence that you have about a proposition, or your prior about the proposition. So, if it’s just incredibly plausible that the coins are generally fair. For example, if you saw me simply pick the coin up out of a stack of otherwise fair coins, in this case you would have evidence that it’s fair. But if you simply live in a world that doesn’t include a lot of very biased coins, then your prior might be doing a lot of the work that your evidence would otherwise do. These are the two things that generate credo-resilience.
In both cases, with the coin, your credence that the coin will land heads on the next flip is the same, it’s 0.5. Your credence of 0.5 about the tested coin is resilient, because you’ve done a million trials of this coin. Whereas, your credence about the untested coin is quite fragile. It could easily move in response to new evidence, as we see here.
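(To make the coin example concrete, here is a minimal sketch. This is my own illustration rather than anything from Askell’s talk, and the Beta-distribution framing is an assumption; the point is just that two credences of 0.5 can respond very differently to the same new evidence.)

```python
from scipy import stats

# Two credences of 0.5 that "the next flip lands heads", with very different resilience.
# Untested coin: essentially no data, modelled here as a uniform Beta(1, 1) prior.
untested = stats.beta(1, 1)
# Well-tested coin: roughly a million recorded flips, half of them heads.
tested = stats.beta(500_000, 500_000)

print(untested.mean(), tested.mean())  # both 0.5

# Now suppose the next 8 flips all land heads and we update each credence.
print(stats.beta(1 + 8, 1).mean())              # ~0.90: the low-resilience credence moves a lot
print(stats.beta(500_000 + 8, 500_000).mean())  # ~0.500004: the high-resilience credence barely moves
```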
Later in the talk, Askell highlights an implication of this idea, and how it differs from the idea of just not having precise probabilities at all:
A lot of people seems to be kind of unwilling to assert probability estimates about whether something is going to work or not. I think a really good explanation for this is that, in cases where we don’t have a lot of evidence, our credences about how good our credences are, are fairly low.
We basically think it’s really likely that we’re going to move around a lot in response to new evidence. We’re just not willing to assert a credence that we think is just going to be false, or inaccurate once we gain a little bit more evidence. Sometimes people think you have mushy credences, that you don’t actually have precise probabilities that you can assign to claims like, “This intervention is effective to Degree N.” I actually think resilience might be a good way of explaining that away, to say, “No. You can have really precise estimates. You just aren’t willing to assert them.”
(This comment thread seems to me to suggest that the term “robustness of credences” may mean the same thing as “resilience of credences”, but I’m not sure about that.)
Evidential weight (balance vs weight of evidence)
In the book chapter linked to above, Askell also discusses the idea of evidential weight (or the idea of the weight of the evidence, as opposed to the balance of evidence). This seems quite similar to the idea of credence resilience.
The balance of the evidence refers to how decisively the evidence supports the proposition. The weight of the evidence is the total amount of relevant evidence that we have.
Since I can’t easily copy and paste from that chapter, for further info see pages 39-41 of that chapter (available in the preview I linked to).
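(Here is one rough way to picture the distinction, as my own gloss rather than something from the chapter: the balance of the evidence is reflected in where your estimate sits, while the weight of the evidence is reflected in how tightly the evidence pins that estimate down.)

```python
from scipy import stats

# Same balance of evidence (about 60% of observations support the claim),
# very different weight of evidence (10 observations vs. 1,000).
light = stats.beta(1 + 6, 1 + 4)        # 6 of 10 observations supporting
heavy = stats.beta(1 + 600, 1 + 400)    # 600 of 1,000 observations supporting

print(light.mean(), heavy.mean())  # both close to 0.6
print(light.std(), heavy.std())    # ~0.14 vs. ~0.015: the heavier evidence pins the estimate down
```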
Probability distributions (and confidence intervals)
Roser writes:
Policy choice can still do justice to precautionary intuitions even when making use of probabilities. I submit that what drives precautionary intuitions is that in cases where there is little and unreliable evidence, our subjective probability distributions should exhibit a larger spread around the best guess. These spread out probability distributions yield precautionary policy-making when they are combined with, for example, the general idea of diminishing marginal utility (Stern 2007, p. 38) or the idea that an equal probability of infringing our descendants’ rights and bequeathing more to them than we owe them does not cancel each other out (Roser and Seidel 2017, p. 82). (emphasis added)
And Nate Soares writes:
we are bounded reasoners, and we usually can’t consider all available hypotheses. [...]
Bounded Bayesian reasoners should expect that they don’t have access to the full hypothesis space. Bounded Bayesian reasoners can expect that their first-order predictions are incorrect due to a want of the right hypothesis, and thus place high credence on “something I haven’t thought of”, and place high value on new information or other actions that expand their hypothesis space. Bounded Bayesians can even expect that their credence for an event will change wildly as new information comes in.
[...] if I expect that I have absolutely no idea what the black swans will look like but also have no reason to believe black swans will make this event any more or less likely, then even though I won’t adjust my credence further, I can still increase the variance of my distribution over my future credence for this event.
In other words, even if my current credence is 50% I can still expect that in 35 years (after encountering a black swan or two) my credence will be very different. This has the effect of making me act uncertain about my current credence, allowing me to say “my credence for this is 50%” without much confidence. So long as I can’t predict the direction of the update, this is consistent Bayesian reasoning. (emphasis added)
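(A minimal sketch of that last point, using made-up numbers: both of the distributions below describe “my credence in this event 35 years from now”, both have mean 0.5, but they encode very different expectations about how much that credence will move.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Two distributions over "my credence in this event 35 years from now".
# Both have mean 0.5 (I can't predict the direction of the update),
# but one expects the credence to stay put and the other expects it to move a lot.
stable_future = rng.beta(200, 200, size=100_000)
volatile_future = rng.beta(0.5, 0.5, size=100_000)

print(stable_future.mean(), volatile_future.mean())  # both ~0.5
print(stable_future.std(), volatile_future.std())    # ~0.025 vs. ~0.35
```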
A good, quick explanation, accompanied by diagrams, can be found in this comment.
Precision, sharpness, vagueness
(These ideas seem closely related to the ideas of probability distributions and confidence intervals [above], and to the concept of haziness [below].)
In this paper, Adam Elga writes:
Sometimes one’s evidence for a proposition is sharp. For example: You’ve tossed a biased coin thousands of times. 83% of the tosses landed heads, and no pattern has appeared even though you’ve done a battery of statistical tests. Then it is clear that your confidence that the next toss will land heads should be very close to 83%.
Sometimes one’s evidence for a proposition is sparse but with a clear upshot. For example: You have very little evidence as to whether the number of humans born in 1984 was even. But it is clear that you should be very near to 50% confident in this claim.
But sometimes one’s evidence for a proposition is sparse and unspecific. For example: A stranger approaches you on the street and starts pulling out objects from a bag. The first three objects he pulls out are a regular-sized tube of toothpaste, a live jellyfish, and a travel-sized tube of toothpaste. To what degree should you believe that the next object he pulls out will be another tube of toothpaste?
[...]
It is very natural in such cases to say: You shouldn’t have any very precise degree of confidence in the claim that the next object will be toothpaste. It is very natural to say: Your degree of belief should be indeterminate or vague or interval-valued. On this way of thinking, an appropriate response to this evidence would be a degree of confidence represented not by a single number, but rather by a range of numbers. The idea is that your probability that the next object is toothpaste should not equal 54%, 91%, 18%, or any other particular number. Instead it should span an interval of values, such as [10%, 80%]. (emphasis added)
Elga then quotes various authors making claims along those lines, and writes:
These authors all agree that one’s evidence can make it downright unreasonable to have sharp degrees of belief. The evidence itself may call for unsharp degrees of belief, and this has nothing to do with computational or representational limitations of the believer. Let me write down a very cautious version of this claim:
UNSHARP: It is consistent with perfect rationality that one have unsharp degrees of belief.
However, Elga spends the rest of the paper arguing against this claim, and arguing instead (based on a type of Dutch book argument) for the following claim:
SHARP: Perfect rationality requires one to have sharp degrees of belief.
(Elga’s arguments seem sound to me, but I think they still allow for representing our beliefs as probability distributions that do have some mean or central or whatever value, and then using that value in many of the contexts Elga talks about. Thus, in those contexts, we’d act as if we have a “sharp degree of belief”, but we could still be guided by the shape and width of our probability distributions when thinking about things like how valuable additional information would be. But I’m not an expert on these topics, and haven’t thought about this stuff in depth.)
See also the Wikipedia article on imprecise probability.
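(To make the interval-valued picture concrete, here is a minimal sketch that carries the [10%, 80%] toothpaste interval through an expected-value calculation. The payoffs are hypothetical numbers chosen purely for illustration.)

```python
# An interval-valued ("unsharp") credence: instead of a single number, carry the
# whole interval through the calculation, so the expected value is itself an interval.
credence_interval = (0.10, 0.80)              # P(next object is toothpaste)
payoff_if_true, payoff_if_false = 10.0, -2.0  # hypothetical stakes

def expected_value(p: float) -> float:
    return p * payoff_if_true + (1 - p) * payoff_if_false

low, high = (expected_value(p) for p in credence_interval)
print(low, high)  # -0.8 to 7.6: the bet looks good or bad depending on where in the interval you stand
```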
Haziness
Chris Smith (I believe that’s their name, based on this post) writes:
Consider a handful of statements that involve probabilities:
A hypothetical fair coin tossed in a fair manner has a 50% chance of coming up heads.
When two buddies at a bar flip a coin to decide who buys the next round, each person has a 50% chance of winning.
Experts believe there’s a 20% chance the cost of a gallon of gasoline will be higher than $3.00 by this time next year.
Dr. Paulson thinks there’s an 80% chance that Moore’s Law will continue to hold over the next 5 years.
Dr. Johnson thinks there’s a 20% chance quantum computers will commonly be used to solve everyday problems by 2100.
Kyle is an atheist. When asked what odds he places on the possibility that an all-powerful god exists, he says “2%.”
I’d argue that the degree to which probability is a useful tool for understanding uncertainty declines as you descend the list.
The first statement is tautological. When I describe something as “fair,” I mean that it perfectly conforms to abstract probability theory.
In the early statements, the probability estimates can be informed by past experiences with similar situations and explanatory theories.
In the final statement, I don’t know what to make of the probability estimate.
The hypothetical atheist from the final statement, Kyle, wouldn’t be able to draw on past experiences with different realities (i.e., Kyle didn’t previously experience a bunch of realities and learn that some of them had all-powerful gods while others didn’t). If you push someone like Kyle to explain why they chose 2% rather than 4% or 0.5%, you almost certainly won’t get a clear explanation.
If you gave the same “What probability do you place on the existence of an all-powerful god?” question to a number of self-proclaimed atheists, you’d probably get a wide range of answers.
I bet you’d find that some people would give answers like 10%, others 1%, and others 0.001%. While these probabilities can all be described as “low,” they differ by orders of magnitude. If probabilities like these are used alongside probabilistic decision models, they could have extremely different implications. Going forward, I’m going to call probability estimates like these “hazy probabilities.”
Placing hazy probabilities on the same footing as better-grounded probabilities (e.g., the odds a coin comes up heads) can lead to problems. (bolding added)
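(A minimal sketch of why this matters, with hypothetical numbers: probabilities that can all be described as “low” but differ by orders of magnitude give wildly different answers once they’re plugged into an expected-value calculation.)

```python
# Hazy probabilities that are all "low" but span several orders of magnitude.
hazy_estimates = [0.1, 0.01, 0.001, 0.00001]
value_if_true = 1_000_000   # hypothetical stakes in some probabilistic decision model

for p in hazy_estimates:
    print(p, p * value_if_true)   # expected values range from 100,000 down to 10
```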
Hyperpriors, credal sets, and other things I haven’t really learned about
Wikipedia says:
Bayesian approaches to probability treat it as a degree of belief and thus they do not draw a distinction between risk and a wider concept of uncertainty: they deny the existence of Knightian uncertainty. They would model uncertain probabilities with hierarchical models, i.e. where the uncertain probabilities are modelled as distributions whose parameters are themselves drawn from a higher-level distribution (hyperpriors). (emphasis added)
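(For what it’s worth, here is a rough sketch of what a hierarchical model of an uncertain probability could look like. The specific distributions are assumptions chosen purely for illustration, not taken from the Wikipedia article; the point is just that the probability p is itself treated as uncertain, with a further prior over how pinned-down it is.)

```python
import numpy as np

rng = np.random.default_rng(0)

# The event probability p is itself uncertain: it is drawn from a Beta distribution
# whose concentration parameter has its own (hyper)prior.
samples = []
for _ in range(10_000):
    concentration = rng.gamma(shape=2.0, scale=5.0)          # hyperprior: how pinned-down is p?
    p = rng.beta(0.5 * concentration, 0.5 * concentration)   # prior over p, centred on 0.5
    samples.append(p)

samples = np.array(samples)
print(samples.mean(), samples.std())  # mean ~0.5, with extra spread coming from the hyperprior
```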
I haven’t looked into this and don’t properly understand it, so I won’t say more about it here, but I think it’s relevant. (This also might be related to the idea of confidence intervals mentioned earlier; as stated at the top, this is a low-effort version of this post where I’m not really trying to explain how the different framings might overlap or differ.)
The ideas of a credal set and of robust Bayesian analysis also seem relevant, but I have extremely limited knowledge on those topics.
I hope you found this somewhat useful. As stated earlier, comments on this post will inform whether I take the time to write something more substantial/useful on this topic later (and, if so, precisely what and how).
Also, if you know of another term/concept/framing that’s relevant, please add a comment mentioning it, to expand the collection here.
It’s possible that something like this has already been done—I didn’t specifically check if it had been done before. If you know of something like this, please comment or message me a link to it.
I think that credences are essentially a subtype of probabilities, and that the fact that Askell uses that term rather than probability doesn’t indicate (a) that we can’t use the term “robustness” in relation to probabilities, or (b) that we can’t use the other terms covered in this post in relation to credences. But I haven’t thought about that in depth.
Appendix D of this report informed a lot of work we did on this, and in decreasing order of usefulness, it lists Shafer’s “Belief functions,” Possibility Theory, and the “Dezert-Smarandache Theory of Plausible and Paradoxical Reasoning.” I’d add “Fuzzy Sets” / “Fuzzy Logic.”
(Note that these are all formalisms in academic writing that predate and anticipate most of what you’ve listed above, but are harder ways to understand it. Except DST, which is hard to justify except as trying to be exhaustive about what people might want to think about non-probability belief.)
See also Open Philanthropy Project’s list of different kinds of uncertainty (and comments on how we might deal with them) here.
I like this and would find a post moderately valuable. I think sometimes posts with a lot of synonyms are hard to have take aways from, because it’s hard to remember all the synonyms. What I think is useful is comparing and contrasting the different takes, creating a richer view of the whole framework by examining it from many angles.
Re Knightian Uncertainty vs. Risk, I wrote a post that discusses the interaction of different types of risks (including knightian) here: https://www.lesswrong.com/posts/eA9a5fpi6vAmyyp74/how-to-understand-and-mitigate-risk
Thanks for the feedback!
I think sometimes posts with a lot of synonyms are hard to have take aways from, because it’s hard to remember all the synonyms. What I think is useful is comparing and contrasting the different takes, creating a richer view of the whole framework by examining it from many angles.
Yeah, I’d agree with that, and it’s part of why fleshing this out is currently low priority for me (since the latter approach takes actual work!), but remains theoretically on the list :)
There are “reliabilist” accounts of what makes a credence justified. There are different accounts, but they say (very roughly) that a credence is justified if it is produced by a process that is close to the truth on average. See [this paper](https://philpapers.org/rec/PETWIJ-2).
Frequentist statistics can be seen as a version of reliabilism. Criteria like the Brier score for evaluating forecasters can also be understood in a reliabilist framework.
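(As a quick illustration of the Brier-score idea, not taken from the paper linked above: the score is just the mean squared gap between stated credences and what actually happened, so it rewards processes whose outputs are close to the truth on average.)

```python
# Brier score for binary outcomes: mean squared error between forecasts and what happened.
def brier_score(forecasts, outcomes):
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

forecaster_a = [0.9, 0.8, 0.2, 0.3]   # credences that each event occurs
forecaster_b = [0.6, 0.5, 0.4, 0.5]   # hedgier credences about the same events
outcomes     = [1,   1,   0,   0]     # what actually happened

print(brier_score(forecaster_a, outcomes))  # 0.045 (lower is better)
print(brier_score(forecaster_b, outcomes))  # 0.205
```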
Adam Binks replied to this list on the EA Forum with:
To add to your list—Subjective Logic represents opinions with three values: degree of belief, degree of disbelief, and degree of uncertainty. One interpretation of this is as a form of second-order uncertainty. It’s used for modelling trust. A nice summary here with interactive tools for visualising opinions and a trust network.
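(A rough sketch of how such an opinion might be represented, based on my understanding of the formalism rather than anything in the linked summary: belief, disbelief, and uncertainty sum to 1, and a base rate fills in the uncertain mass when a single projected probability is needed.)

```python
from dataclasses import dataclass

@dataclass
class Opinion:
    belief: float        # degree of belief
    disbelief: float     # degree of disbelief
    uncertainty: float   # degree of uncertainty; belief + disbelief + uncertainty = 1
    base_rate: float = 0.5

    def projected_probability(self) -> float:
        # Collapse the opinion to a single number by filling the uncertain mass with the base rate.
        return self.belief + self.base_rate * self.uncertainty

confident = Opinion(belief=0.45, disbelief=0.45, uncertainty=0.10)
hazy = Opinion(belief=0.05, disbelief=0.05, uncertainty=0.90)

print(confident.projected_probability(), hazy.projected_probability())  # both 0.5
```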