In other words, I think it’s more useful to think of those definitions as an algorithm (perhaps ML): certainty ~ f(risk, uncertainty); and the definitions provided of the driving factors as initial values. The users can then refine their threshold to improve the model’s prediction capability over time, but also as a function of the class of problems (i.e. climate vs software).
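(To check that I’m reading that framing correctly, here’s a minimal sketch of how I interpret it. The particular choice of f, the names, and the update rule are all illustrative placeholders of mine, not anything you actually proposed.)

```python
# A minimal sketch of my reading of the framing above (illustrative only):
# "certainty" is the output of some function f over risk/uncertainty inputs,
# and each class of problems keeps its own decision threshold, which users
# refine over time based on how well the resulting predictions turn out.

from collections import defaultdict

class CertaintyModel:
    def __init__(self, initial_threshold=0.5):
        # The provided definitions act as initial values; each problem class
        # (e.g. "climate", "software") gets its own refinable threshold.
        self.thresholds = defaultdict(lambda: initial_threshold)

    def certainty(self, risk, uncertainty):
        # Placeholder f(risk, uncertainty): any monotone combination would do.
        return risk / (risk + uncertainty + 1e-9)

    def is_certain_enough(self, problem_class, risk, uncertainty):
        return self.certainty(risk, uncertainty) >= self.thresholds[problem_class]

    def update(self, problem_class, prediction_was_good, step=0.05):
        # Users refine the threshold as feedback accumulates.
        if prediction_was_good:
            self.thresholds[problem_class] -= step  # trust this framing a bit more
        else:
            self.thresholds[problem_class] += step  # demand more grounding next time
```

On that reading, calling update("software", prediction_was_good=False) nudges the threshold for software problems upwards over time, separately from the threshold for climate problems.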
I think I agree with substantial parts of both the spirit and specifics of what you say. And your comments have definitely furthered my thinking, and it’s quite possible I’d now write this quite differently, were I to do it again. But I also think you’re perhaps underestimating the extent to which risk vs uncertainty very often is treated as an absolute dichotomy, with substantial consequences. I’ll now attempt to lay out my thinking in response to your comments, but I should note that my goal isn’t really to convince you of “my side”, and I’d consider it a win to be convinced of why my thinking is wrong (because then I’ve learned something, and because that which can be destroyed by the truth should be, and all that).
For the most part, you seem to spend a lot of time trying to discover whether terms like unknown probability and known probability make sense. Yet, those are language artifacts which, like everything in language, are merely the use of a classification algorithm as a means to communicate abstractions. Each class represents primarily its dominating modes, but becomes increasingly useless at the margins.
From memory, I think I agreed with basically everything in Eliezer’s sequence A Human’s Guide to Words. One core point from that seems to closely match what you’re saying:
The initial clue only has to lead the user to the similarity cluster—the group of things that have many characteristics in common. After that, the initial clue has served its purpose, and I can go on to convey the new information “humans are currently mortal”, or whatever else I want to say about us featherless bipeds.
A dictionary is best thought of, not as a book of Aristotelian class definitions, but a book of hints for matching verbal labels to similarity clusters, or matching labels to properties that are useful in distinguishing similarity clusters.
And it’s useful to have words to point to clusters in thingspace, because it’d be far too hard to try to describe, for example, a car on the level of fundamental physics. So instead we use labels and abstractions, and accept there’ll be some fuzzy boundaries and edge cases (e.g., some things that are sort of like cars and sort of like trucks).
One difference worth noting between that example and the labels “risk” and “uncertainty” is that risk and uncertainty are like two different “ends” or “directions” of a single dimension in thingspace. (At least, I’d argue they are, and it’s possible that that has to be based on a Bayesian interpretation of probability.) So here it seems to me it’d actually be very easy to dispense with having two different labels. Instead, we can just have one for the dimension as a whole (e.g., “trustworthy”, “well-grounded”, “resilient”; see here), and then use that in combination with “more”, “less”, “extremely”, “hardly at all”, etc., and we’re done.
We can then very clearly communicate the part that’s real (the part that reflects the territory) of what we were getting at when we talked about “risk” and “uncertainty”, without confusing ourselves into thinking that there’s some sharp line somewhere, or that it’s obvious a different strategy would be needed in “one case” than in “the other”. This is in contrast to the situation with cars, where it’d be much less useful to say “more car-y” or “less car-y”—do we mean along the size dimension, as compared to trucks? On the size dimension, as compared to mice? On the “usefulness for travelling in” dimension? On the “man-made vs natural” dimension? It seems to me that it’s the high dimensionality of thingspace that means labels for clusters are especially useful and hard to dispense with—when we’re talking about two “regions” or whatever of a single dimension, the usefulness of separate labels is less clear.
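(As a toy illustration of what I mean by one dimension plus degree modifiers, rather than two separate labels: we could imagine tagging each estimate with a probability and a continuous “resilience” score, and letting degree words do the rest. The field names and cut-offs below are purely illustrative.)

```python
# Toy illustration only: one continuous "resilience" dimension in place of a
# binary risk-vs-Knightian-uncertainty flag. Names and cut-offs are arbitrary.

from dataclasses import dataclass

@dataclass
class Estimate:
    probability: float  # best-guess probability of the event
    resilience: float   # 0.0 = hardly grounded at all, 1.0 = extremely well-grounded

    def describe(self) -> str:
        # Degree words do the work the two labels were trying to do.
        if self.resilience > 0.9:
            degree = "extremely well-grounded"
        elif self.resilience > 0.6:
            degree = "fairly well-grounded"
        elif self.resilience > 0.3:
            degree = "only somewhat grounded"
        else:
            degree = "hardly grounded at all"
        return f"~{self.probability:.0%}, {degree}"

# A die roll and a hard-to-model novel catastrophe then differ in degree, not in kind:
print(Estimate(probability=1/6, resilience=0.99).describe())   # ~17%, extremely well-grounded
print(Estimate(probability=0.05, resilience=0.15).describe())  # ~5%, hardly grounded at all
```

The point is just that, on this framing, the two cases sit at different points along one axis, rather than falling into two different kinds of thing.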
That said, there are clearly loads of examples of using two labels for different points along a single dimension. E.g., short and tall, heavy and light. This is an obvious and substantial counterpoint to what I’ve said above.
But it also brings me to really my more central point, which is that people who claim real implications from a risk-uncertainty distinction typically don’t ever talk about “more risk-ish” or “more Knightian” situations, but rather just situations of “risk” or of “uncertainty”. (One exception is here.) And they make that especially clear when they say things like that we “completely know” or “completely cannot know” the probabilities, or we have “zero” knowledge, or things like that. In contrast, with height, it’s often useful to say “short” or “tall”, and assume a shared reference frame that makes it clear roughly what we mean by that (e.g., it’s different for buildings than for people), but we also very often say things like “more” or “less” tall, “shorter”, etc., and we never say “This person has zero height” or “This building is completely tall”, or the like, except perhaps to be purposefully silly.
So while I agree that words typically point to somewhat messy clusters in thingspace, I think there are a huge number of people who don’t realise (or agree with) that, and who seem to truly believe there’s a clear, sharp distinction between risk and uncertainty, and who draw substantial implications from that (e.g., that we need to use methods other than expected value reasoning, as discussed in this post, or that we should entirely ignore possibilities we can’t “have” probabilities about, an idea which the quote from Bostrom & Cirkovic points out the huge potential dangers of).
So one part of what you say that I think I do disagree with, if I’m interpreting you correctly, is “These quotes and definitions are using imagery to teach readers the basic classification system (i.e. the words) to the reader, by proposing initial but vague boundaries.” I really don’t think most writers who endorse the risk-uncertainty distinction think that that’s what they’re doing; I think they think they’re really pointing to two cleanly separable concepts. (And this seems reflected in their recommendations—they don’t typically refer to things like gradually shifting our emphasis from expected value reasoning to alternative approaches, but rather using one approach when we “have” probabilities and another when we “don’t have probabilities”, for example.)
And a related point is that, even though words typically point to somewhat messy clusters in thingspace, some words can be quite misleading and do a poor job of marking out meaningful clusters. This is another point Eliezer makes:
Any way you look at it, drawing a boundary in thingspace is not a neutral act. Maybe a more cleanly designed, more purely Bayesian AI could ponder an arbitrary class and not be influenced by it. But you, a human, do not have that option. Categories are not static things in the context of a human brain; as soon as you actually think of them, they exert force on your mind. One more reason not to believe you can define a word any way you like.
A related way of framing this is that you could see the term “Knightian uncertainty” as sneaking in connotations that this is a situation where we have to do something other than regular expected value reasoning, or where using any explicit probabilities would be foolish and wrong. So ultimately I’m sort-of arguing that we should taboo the terms “risk” (used in this sense) and “Knightian uncertainty”, and just speak in terms of how uncertain/resilient/trustworthy/whatever a given uncertainty is (or how wide the confidence intervals or error bars are, or whatever).
But what I’ve said could be seen as just indicating that the problem is that advocates of the risk-uncertainty distinction need to read the sequences—this is just one example of a broader problem, which has already been covered there. This seems similar to what you’re saying with:
So you get caught in a rabbit hole as you are essentially re-discovering the limitations of language and classification systems, rather than actually discussing the problem at hand. And the initial problem statement, in your analysis (i.e. certainty as risk+uncertainties) is arbitrary, and your logic could have been applied to any definition or concept.
I think there’s something to this, but I still see the risk-uncertainty distinction proposed in absolute terms, even on LessWrong and the EA Forum, so it seemed worth discussing it specifically. (Plus possibly the fact that this is a one-dimensional situation, so it seems less useful to have totally separate labels than it is in many other cases like with cars, trucks, tigers, etc.)
But perhaps even if it is worth discussing that specifically, I should’ve more clearly situated it in the terms established by that sequence—using some of those terms, perhaps changing my framing, adding some links. I think there’s something to this as well, and I’d probably do that if I were to rewrite this.
And something I do find troubling is the possibility that the way I’ve discussed these problems leans problematically on terms like “absolute, binary distinction”, which should really be tabooed and replaced by something more substantive. I think that the term “absolute, binary distinction” is sufficiently meaningful to be ok to be used here, but it’s possible that it’s just far, far more meaningful than the term “Knightian uncertainty”, rather than “absolutely” more meaningful. As you can probably tell, this particular point is one I’m still a bit confused about, and will have to think about more.
And the last point I’ll make relates to this:
Whether the idea of uncertainty+risk is the proper tool can essentially only be analyzed empirically, by comparing it, for example, to another method used in a given field, and evaluating whether method A or B improves the ability of planners to make date/cost predictions (in software engineering, for example).
This is basically what my next post will do. It focuses on whether, in practice, the concept of a risk-uncertainty distinction is useful, whether or not it “truly reflects reality” or whatever. So I think that post, at least, will avoid the issues you perceive (at least partially correctly, in my view) in this one.
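(For what it’s worth, I take the kind of empirical comparison you’re pointing at to be roughly the following sort of thing; the data and both “methods” here are made-up placeholders of mine, just to show the shape of the test.)

```python
# Hypothetical sketch of the empirical comparison described above: score two
# estimation approaches on past projects by their prediction error.
# The numbers and both "methods" are placeholders, not real data.

def mean_absolute_error(predictions, actuals):
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(actuals)

# (method A's predicted weeks, method B's predicted weeks, actual weeks) per past project
past_projects = [
    (10, 12, 14),
    (20, 25, 22),
    (8, 9, 16),
]

preds_a = [project[0] for project in past_projects]
preds_b = [project[1] for project in past_projects]
actuals = [project[2] for project in past_projects]

print("Method A mean absolute error:", mean_absolute_error(preds_a, actuals))
print("Method B mean absolute error:", mean_absolute_error(preds_b, actuals))
```

The interesting question, to me, is then whether a method built on a sharp risk-uncertainty distinction actually does better on that kind of score than one that just uses explicit probabilities of varying resilience.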
I’d be interested in your thoughts on these somewhat rambly thoughts of mine.
Update: I’ve now posted that “next post” I was referring to (which gets into whether the risk-uncertainty distinction is a useful concept, in practice).