I would NOT have serious disagreements with e.g. Vaniver’s list.
I think they would have significant practical disagreement with #3, given the widespread use of NHST, but clever frequentists are as quick as anyone else to point out that NHST doesn’t actually do what its users want it to do.
Sure, I would quibble about accents, importances, and priorities, but there’s nothing there that would be unacceptable from the mainstream point of view.
Hence the importance of the qualifier ‘qualitative’; it seems to me that accents, importances, and priorities are worth discussing, especially if you’re interested in changing System 1 thinking instead of System 2 thinking. The mainstream frequentist thinks that base rate neglect is a mistake, but the Bayesian both thinks that base rate neglect is a mistake and has organized his language to make that mistake obvious when it occurs. If you take revealed preferences seriously, it looks like the frequentist says base rate neglect is a mistake but the Bayesian lives that base rate neglect is a mistake.
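To make the base-rate point concrete, here is a minimal sketch in Python; the screening-test numbers and the 1% base rate are invented for illustration, not taken from anything above.

```python
# Minimal sketch of base rate neglect, with made-up numbers: a test that is
# 90% sensitive and 91% specific for a condition 1% of the population has.
prior = 0.01             # P(condition) -- the base rate
sensitivity = 0.90       # P(positive | condition)
false_positive = 0.09    # P(positive | no condition)

# Bayes' theorem: P(condition | positive)
evidence = sensitivity * prior + false_positive * (1 - prior)
posterior = sensitivity * prior / evidence

print(f"P(condition | positive) = {posterior:.2f}")   # ~0.09, not 0.90
# Neglecting the base rate means reading the answer off the sensitivity
# (0.90) instead of computing the posterior (~0.09).
```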
Now, why Bayes specifically? I would be happy to point to Laplace instead of Bayes, personally, since Laplace seems to have been way smarter and a superior rationalist. But the trouble with naming methods of “thinking correctly” is that everyone wants to name their method “thinking correctly,” and so you rapidly trip over each other. “Rationalism,” for example, refers to a particular philosophical position which is very different from the modal position here at LW. Bayes is useful as a marker, but it is not necessary to come to those insights by way of Bayes.
(I will also note that not disagreeing with something and discovering something are very different thresholds. If someone has a perspective which allows them to generate novel, correct insights, that perspective is much more powerful than one which merely serves to verify that insights are correct.)
Yeah, I said if I were to pretend to be a frequentist—but that didn’t involve suddenly becoming dumb :-)
it seems to me that accents, importances, and priorities are worth discussing
I agree, but at this point context starts to matter a great deal. Are we talking about decision-making in regular life? Like, deciding which major to pick, who to date, what job offer to take? Or are we talking about some explicitly statistical environment where you try to build models, fit them, evaluate them, do out-of-sample forecasting, all that kind of thing?
I think I would argue that recognizing biases (Tversky/Kahneman style) and trying to correct for them—avoiding them altogether seems too high a threshold—is different from what people call Bayesian approaches. The Bayesian way of updating on the evidence is part of “thinking correctly”, but there is much, much more than just that.
I think I would argue that recognizing biases (Tversky/Kahneman style) and trying to correct for them—avoiding them altogether seems too high a threshold—is different from what people call Bayesian approaches.
At least one (and I think several) of the biases identified by Tversky and Kahneman is of the form “people do X, a Bayesian would do Y, thus people are wrong,” so I think you’re overstating the difference. (I don’t know enough historical details to be sure, but I suspect Tversky and Kahneman might be an example of the Bayesian approach allowing someone to discover novel, correct insights.)
The Bayesian way of updating on the evidence is part of “thinking correctly”, but there is much, much more than just that.
I agree, but it feels like we’re disagreeing. It seems to me that a major Less Wrong project is “thinking correctly,” and a major part of that project is “decision-making under uncertainty,” and a major part of dealing with uncertainty is dealing with probabilities, and the Bayesian way of dealing with probabilities seems to be the best, especially if you want to use those probabilities for decision-making.
So it sounds to me like you’re saying “we don’t just need stats textbooks, we need Less Wrong.” I agree; that’s why I’m here as well as reading stats textbooks. But it also sounds to me like you’re saying “why are you naming this Less Wrong stuff after a stats textbook?” The easy answer is that it’s a historical accident, and it’s too late to change it now. Another answer I like better is that much of the Less Wrong stuff comes from thinking about and taking seriously the stuff from the stats textbook, and so it makes sense to keep the name, even if we’re moving to realms where the connection to stats isn’t obvious.
Hm… Let me try to unpack my thinking, in particular my terminology, which might not exactly match the usual LW conventions. I think of:
Bayes’ theorem as a simple, conventional, and entirely uncontroversial statistical result. If you ask a dyed-in-the-wool rabid frequentist whether Bayes’ theorem is true, he’ll say “Yes, of course”.
Bayesian statistics as an approach to statistics with three main features. First is the philosophical interpretation of (some) probability as subjective belief. Second is the focus on conditional probabilities. Third is the strong preference for full (posterior) distributions, rather than point estimates, as answers (see the short sketch after this comment).
Cognitive biases (aka the Kahneman/Tversky stuff) as certain distortions in the way our wetware processes information about reality, as well as certain peculiarities in human decision-making. Yes, a lot of it is concerned with dealing with uncertainty. Yes, there is some synergy with Bayesian statistics. No, I don’t think this synergy is the defining factor here.
I understand that historically, in the LW community, Bayesian statistics and cognitive biases were intertwined. But apart from historical reasons, it seems to me these are two different things, and the degree of their, um, interpenetration is much overstated on LW.
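As a rough illustration of the third feature above (the full posterior rather than a point estimate), here is a minimal sketch; the data (7 successes in 10 trials) and the uniform Beta(1,1) prior are placeholders chosen only for illustration.

```python
# Minimal sketch: a Beta-Binomial posterior versus a point estimate.
successes, trials = 7, 10
alpha, beta = 1 + successes, 1 + (trials - successes)    # posterior is Beta(8, 4)

point_estimate = successes / trials                       # the MLE: 0.7
posterior_mean = alpha / (alpha + beta)                   # 8/12 ~ 0.667
posterior_var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))

print(f"point estimate: {point_estimate}")
print(f"posterior: Beta({alpha}, {beta}), mean {posterior_mean:.3f}, "
      f"sd {posterior_var ** 0.5:.3f}")
# The Bayesian answer is the whole Beta(8, 4) distribution; a point estimate
# throws away the spread, i.e. the uncertainty about the parameter.
```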
it sounds to me like you’re saying “we don’t just need stats textbooks, we need Less Wrong.”
Well, we need it for which purpose? For real-life decision-making? Sure, but then no one is claiming that stats textbooks are sufficient for that.
much of the Less Wrong stuff comes from thinking about and taking seriously the stuff from the stats textbook
Some, not much. I can argue that much of LW stuff comes from thinking logically and following chains of reasoning to their conclusion—or actually just comes from thinking at all instead of reacting instinctively / on the basis of a gut feeling or whatever.
I agree that thinking in probabilities is a very big step and it *is* tied to Bayesian statistics. But still it’s just one step.
I can argue that much of LW stuff comes from thinking logically … I agree that thinking in probabilities is a very big step
I agree with your terminology. When contrasting LW stuff and mainstream rationality, I think the reliance on thinking in probabilities is a big part of the difference. (“Thinking logically,” for the mainstream, seems to be mostly about logic of certainty.) When labeling, it makes sense to emphasize contrasting features. I don’t think that’s the only large difference, but I see an argument (which I don’t fully endorse) that it’s the root difference.
(For example, consider evolutionary psychology, a moderately large part of LW. This seems like a field of science particularly prone to uncertainty, where “but you can’t prove X!” would often be a conversation-stopper. For the Bayesian, though, it makes sense to update in the direction of evo psych, even though it can’t be proven, which is then beneficial to the extent that evo psych is useful.)
When contrasting LW stuff and mainstream rationality, I think the reliance on thinking in probabilities is a big part of the difference. (“Thinking logically,” for the mainstream, seems to be mostly about logic of certainty.)
Yes, I think you’re right.
For the Bayesian, though, it makes sense to update in the direction of evo psych, even though it can’t be proven
Um, I’m not so sure about that. The main accusation against evolutionary psychology is that it’s nothing but a bunch of just-so stories, aka unfalsifiable post-hoc narratives. And a Bayesian update should be on the basis of evidence, not on the basis of an unverifiable explanation.
The main accusation against evolutionary psychology is that it’s nothing but a bunch of just-so stories, aka unfalsifiable post-hoc narratives.
It seems to me that if you think in terms of likelihoods, you look at a story and say “but the converse of this story has high enough likelihood that we can’t rule it out!” whereas if you think in terms of likelihood ratios, you say “it seems that this story is weakly more plausible than its converse.”
I’m thinking primarily of comments like this. I think it is a reasonable conclusion that anger seems to be a basic universal emotion because ancestors who had the ‘right’ level of anger reproduced more than those who didn’t. Boris just notes that it could be the case that anger is a byproduct of something else, but doesn’t note anything about the likelihood of anger being universal in a world where it is helpful (very high) and the likelihood of anger being universal in a world where it is neutral or unhelpful (very low). We can’t rule out anger being spurious, but asking to rule that out is mistaken, I think, because the likelihood ratio is so significant. It doesn’t make sense to bet against anger being reproductively useful in the ancestral environment (but I think it makes sense to assign a probability to that bet, even if it’s not obvious how one would resolve it).
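As a rough numerical sketch of the likelihood-ratio framing, with invented placeholder likelihoods (these are not estimates anyone in the thread has made):

```python
# Sketch of "likelihoods" versus "likelihood ratios", with invented numbers.
p_universal_if_useful = 0.90    # P(anger universal | anger was reproductively useful)
p_universal_if_spurious = 0.05  # P(anger universal | anger was neutral or unhelpful)

# "Can't rule it out" framing: both likelihoods are nonzero, so both stories survive.
# Likelihood-ratio framing: the observation favors "useful" by a factor of 18.
likelihood_ratio = p_universal_if_useful / p_universal_if_spurious
print(likelihood_ratio)                 # 18.0

# Posterior odds = prior odds * likelihood ratio, so even a skeptical prior
# of 1:3 against "useful" ends up at 6:1 in its favor.
prior_odds = 1 / 3
print(prior_odds * likelihood_ratio)    # 6.0
```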
It seems to me that if you think in terms of likelihoods, you look at a story and say “but the converse of this story has high enough likelihood that we can’t rule it out!” whereas if you think in terms of likelihood ratios, you say “it seems that this story is weakly more plausible than its converse.”
I have several problems with this line of reasoning. First, I am unsure what it means for a story to be true. It’s a story—it arranges a set of facts in a pattern pleasing to the human brain. Not contradicting any known facts is a very low threshold (see Russell’s teapot); to call something “true” I’ll need more than that, and if a story makes no testable predictions, I am not sure on what basis I should evaluate its truth, or what that would even mean.
Second, it seems to me that in such situations the likelihoods, and so, necessarily, their ratios, are very, very fuzzy. My meta uncertainty—uncertainty about probabilities—is quite high. I might say “story A is weakly more plausible than story B”, but my confidence in my judgment about plausibility is very low. This judgment might not be worth anything.
Third, likelihood ratios are good when you know you have a complete set of potential explanations. And you generally don’t. For open-ended problems the explanation “something else” frequently looks like the more plausible one, but again, the meta uncertainty is very high—not only do you not know how uncertain you are, you don’t even know what you are uncertain about! Nassim Taleb’s black swans are precisely the beasties that appear out of “something else” to bite you in the ass.
First, I am unsure what it means for a story to be true.
Ah, by that I generally mean something like “the causal network N with a particular factorization F is the underlying causal representation of reality,” and so a particular experiment measures data and then we calculate “the aforementioned causal network would generate this data with probability P” for various hypothesized causal networks.
For situations where you can control at least one of the nodes, it’s easy to see how you can generate data useful for this. For situations where you only have observational data (like the history of human evolution, mostly), it’s trickier to determine which causal network(s) fit best, but it’s often still possible to learn quite a bit more about the underlying structure than is obvious at first glance.
So suppose we have lots of historical lives which are compressed down to two nodes: A, which measures “anger” (integer-valued and non-negative, say), and C, which measures “children” (also integer-valued and non-negative). The story “anger is spurious” is the network where A and C don’t have a link between them, and the story “anger is reproductively useful” is the network where A->C and there is some nonzero value a^* of A which maximizes the expected value of C. If we see a relationship between A and C in the data, it’s possible that the relationship was generated by the “anger is spurious” network, which says those variables are independent, but we can calculate the likelihood of that and find that it is very, very low, especially as we accumulate more and more data.
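Here is a minimal sketch of that comparison in code; the Poisson forms, the peak value a_star (standing in for a^*), and all the numbers are assumptions made purely for illustration.

```python
# Sketch: compare "anger is spurious" (A and C independent) against
# "anger is reproductively useful" (A -> C) on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 1000
a_star = 3                                      # hypothetical 'right' level of anger

A = rng.poisson(3, size=n)                      # anger (non-negative integers)
rate = 2.5 * np.exp(-0.3 * (A - a_star) ** 2)   # expected children, peaked at a_star
C = rng.poisson(rate)                           # children, generated via A -> C

# "Anger is spurious": A and C independent, C modeled by a single Poisson rate.
loglik_spurious = stats.poisson(C.mean()).logpmf(C).sum()

# "Anger is reproductively useful": C's rate depends on A (here we reuse the
# true form as a shortcut; in practice this model would itself be fitted).
loglik_useful = stats.poisson(rate).logpmf(C).sum()

print(loglik_spurious, loglik_useful)
# The log-likelihood gap is large and grows with more data: the observed
# relationship is astronomically unlikely under the "spurious" network.
```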
Third, likelihood ratios are good when you know you have a complete set of potential explanations. And you generally don’t.
Sure. But even if you’re only aware of two hypotheses, it’s still useful to use the LR to determine which to prefer; even if some third, hidden hypothesis is the best one, that can’t swap the ordering of the two known hypotheses, since their relative odds depend only on their prior odds and the likelihood ratio between them.
Nassim Taleb’s black swans are precisely the beasties that appear out of “something else” to bite you in the ass.
Yes, reversal effects are always possible, but I think that putting too much weight on this argument leads to Anton-Wilsonism (certainty is necessary but impossible). I think we do often have a good idea of what our meta uncertainty looks like in a lot of cases, and that’s generally enough to get the job done.
I have only glanced at Pearl’s work, not read it carefully, so my understanding of causal networks is very limited. But I don’t understand on the basis of which data you will construct the causal network for anger and children (and it’s actually more complicated, because there are important society-level effects). In what will you “see a relationship between A and C”? On the basis of what will you be calculating the likelihoods?
In what will you “see a relationship between A and C”? On the basis of what will you be calculating the likelihoods?
Ideally, you would have some record. I’m not an expert in evo psych, so I can’t confidently say what sort of evidence they actually rely on. I was hoping more to express how I would interpret a story as a formal hypothesis.
I get the impression that a major technique in evolutionary psychology is making use of the selection effect due to natural selection: if you think that A is heritable, and that different values of A have different levels of reproductive usefulness, then in steady state the distribution of A in the population gives you information about the historic relationship between A and reproductive usefulness, without even measuring the relationship between A and C in this generation. So you can ask the question “what’s the chance of seeing the cluster of human anger that we have if there’s not a relationship between A and reproduction?” and get answers that are useful enough to focus most of your attention on the “anger is reproductively useful” hypothesis.
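A minimal simulation sketch of that selection-effect argument, with invented parameters: a heritable trait whose reproductive usefulness peaks at some value ends up concentrated near that value, so the present-day distribution carries information about the historic fitness relationship.

```python
# Sketch: stabilizing selection pulls a heritable trait toward its fitness peak.
import numpy as np

rng = np.random.default_rng(0)
a_star, pop = 3.0, 5000                        # hypothetical fitness peak, population size
A = rng.normal(0.0, 2.0, size=pop)             # initial trait values, centered far from the peak

for _ in range(200):                           # generations
    fitness = np.exp(-0.5 * (A - a_star) ** 2)                     # reproductive usefulness
    parents = rng.choice(A, size=pop, p=fitness / fitness.sum())   # selection
    A = parents + rng.normal(0.0, 0.1, size=pop)                   # heritable, small mutation

print(A.mean(), A.std())   # the mean ends up near a_star = 3, with a small spread
```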