That is very interesting, mostly because I do think exactly that: people are putting too much faith in textbook science. I’m also a little bit uncomfortable with the suggested classification.
I have high confidence in claims that I think are at low risk of being falsified soon, not because they are settled science but because this sentence is a tautology. The causality runs the other way: if our confidence in a claim is high, we provisionally accept it as knowledge.
By contrast, I am worried about the social process of claims moving from unsettled to settled science. In my personal opinion there is an abundance of overconfidence in what we would call “settled science”. The majority of the claims therein are likely to be correct and hold up under scrutiny, but the bar is still lower than I would prefer.
But maybe I’m way off the mark here, or maybe we are splitting hairs and describing the same situation from different angles. There is lots of good science out there, and you need overwhelming evidence to justify questioning a standard textbook. But there is also plenty of junk that makes it all the way into lecture halls, never mind all the previous hoops it had to pass through to get there. I am very worried about the statistical power of our scientific institutions in separating truth from fiction, and I don’t think the settled/unsettled distinction helps address this.
I have another way of stating my concern with the rhetoric and thought here.
People start as “level 1” readers of science, and they may level up as they read more. One of the “skill slots” they can improve is their skepticism: an intuitive sense of how much confidence to place in a claim, and why.
To me, this line of argument is mainly aimed at those “level 1” readers. The message is “Hey, there’s a lot of junk out there, and some of it even makes it into textbooks! It’s hard to say how much, but watch out!” That message is useful to its audience if it builds more accurate intuitions about how to interpret science. And it might well have that effect in a nonzero number of cases.
However, it seems to me that it could also build worse intuitions about how to read science in “level 1” readers, by causing them to wildly overcorrect. For example, I have a friend who is deep into intelligent design, and has surrounded himself with other believers in ID (who are PhD-holding scientists). He views them as mentors. They’ve taught him not only about ID, but also a robust set of techniques for absolutely trashing every piece of research into evolution that he gets his hands on. It’s a one-sided demand for rigor, to be sure, but it’s hard to see or accept that when your community of practice has downleveled your ability to read scientific literature.
I spend quite a bit of time reading the output of the online rationalist and rat-adjacent community. I see almost no explicit writing on when and why we should sometimes believe the contents of scientific literature, and a gigantic amount of writing on why we should be profoundly skeptical that it has truth content. I see a one-sided demand for rigor in this, on a community-wide level.
It’s this problem that I am trying to correct for, by being skeptical of the skepticism, using its own heuristics:
We should be careful before we extrapolate.
There is a range of appropriate intuitive priors for published peer-reviewed literature, ranging from “unsettled” to “settled.” We should determine that prior when we consider the truth-value of a particular claim, as the sketch below illustrates.
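To put a number on why that prior matters, here is a minimal sketch in Python. It is nothing more than the standard Bayes arithmetic behind positive predictive value, and the priors, the 0.8 power, and the 0.05 false-positive rate are illustrative numbers I picked, not measurements of any field:

```python
# How much should one "statistically significant" result move you?
# That depends heavily on the prior you assign to the claim.
# All numbers here are illustrative assumptions, not field estimates.

def posterior(prior, power=0.8, alpha=0.05):
    """P(claim is true | significant result), by Bayes' rule."""
    true_positive = power * prior          # true claim, correctly detected
    false_positive = alpha * (1 - prior)   # false claim, fluke significance
    return true_positive / (true_positive + false_positive)

for label, prior in [
    ("settled textbook claim", 0.90),
    ("plausible hypothesis in a mature field", 0.50),
    ("speculative hypothesis in a hot new area", 0.10),
]:
    print(f"{label:>41}: prior {prior:.2f} -> posterior {posterior(prior):.2f}")
```

The same positive result leaves the speculative claim at only about two-thirds confidence while the textbook claim ends up near certainty; that is the concrete sense in which determining the prior first does real work.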
Here’s how I might express this to a “level 1” reader of science:
In the soft sciences, scientists can identify datasets and hypotheses where they’re confident that the garden of forking paths has many destinations. This aligns with statistical studies of replication rate failures, as well as our knowledge of the limited resources and incentive structures in scientific research. Furthermore, not all experimental methods are really appropriate for the thematic questions they claim to investigate.
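If you would like to see the forking-paths mechanism with your own eyes, here is a toy Python simulation I made up purely for illustration: the data contain no real effect at all, but the analyst gets to try several defensible-looking analysis routes and report the best one.

```python
import random
import statistics

# Toy forking-paths demo: there is NO real effect in the data, but the
# analyst tries several analysis routes and keeps the most impressive one.
# For simplicity each route redraws the data, so the routes are independent;
# real forking paths reuse one dataset, which softens but does not remove
# the inflation.

random.seed(0)

def t_stat(xs, ys):
    """Two-sample t statistic (equal variances and group sizes, for brevity)."""
    n = len(xs)
    pooled = (statistics.variance(xs) + statistics.variance(ys)) / 2
    return (statistics.mean(xs) - statistics.mean(ys)) / (2 * pooled / n) ** 0.5

def one_study(n=40, routes=8):
    """Analyze one null experiment `routes` ways; return the best |t| found."""
    best = 0.0
    for _ in range(routes):
        treatment = [random.gauss(0, 1) for _ in range(n)]  # no effect
        control = [random.gauss(0, 1) for _ in range(n)]    # no effect
        best = max(best, abs(t_stat(treatment, control)))
    return best

trials = 2000
hits = sum(one_study() > 1.96 for _ in range(trials))  # roughly p < .05
print(f"'Significant' somewhere in {hits / trials:.0%} of pure-noise studies")
```

With eight routes you land far above the nominal 5% false-positive rate, around a third of studies in this cartoon, despite there being nothing to find.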
Given all this, it’s wise to be skeptical of individual experiments. And we should be aware of the perspective Scott Alexander articulated in “Beware the Man of One Study,” where for some topics, like the effect of the minimum wage, the challenge of parsing the research either requires a dedicated expert, or simply can’t be done responsibly by anyone.
HOWEVER.
The emerging field of what we might call “empirical skepticism” is itself unsettled science. It’s a hot, relatively new field. Its methods are difficult to interpret, and may not be reliable in some cases. It is subject to exactly the same biases and incentive structures as the rest of science. We should have the same careful, skeptical attitude when we interpret its results as we’d have for any other new field.
If there are real stakes for your reading of science, and you’re trying earnestly to find the truth, then there’s a whole suite of questions to consider, both for selecting which publications to prioritize and for interpreting what you find there. What is the mechanism of action? How much risk is there of a garden of forking paths? Do the researchers stand to benefit more from getting a publication, or from finding the truth? Are the methods appropriate for the question at hand? How reputable is the source? Do you have enough background knowledge to interpret the object-level claims? Is this in the “hard” or the “soft” sciences?
As you build skill in reading science, and get plugged into the right recommendation networks, you’ll gain more savvy for which of these questions are most important to address for any particular paper. If you read a lot, and ask good questions along the way, you’ll stumble your way toward a more sophisticated understanding of the literature, and possibly even the truth!
I’ve upvoted you for the clear presentation. Most of the points you state are beliefs I held several years ago, and they sounded perfectly reasonable to me. However, over time the track record of this view worsened and worsened, to the point where I now disagree not so much on the object level as with the assumption that this view is valuable to have. I hope you’ll bear with me as I take a shot at explaining this.
I think the first, major point of disagreement is whether the target audience of a paper like this is the “level 1” readers. To me it seems like the target audience consists of scientists and science fans, most of whom already have a lot of faith in the accuracy of the scientific process. It is completely true that showing this piece to someone who has managed to work their way into an unreasonable belief can make it harder for them to escape that particular trap, but unfortunately that doesn’t make it wrong. That’s the valley of bad rationality and all that. In fact, I think that strongly supports my main original claim: there are so many ways of using sophisticated arguments to get to a wrong conclusion, and only one way to accurately tally up the evidence, that it takes skill and dedication to get to the right answer consistently.
I’m sorry to hear about your friend, and by all means try to keep him away from posts like this. If I understand correctly, you are roughly saying: “Science is difficult and not always accurate, but posts like this overshoot on the skepticism. There is some value in trusting published peer-reviewed science over the alternatives, and this view is heavily underrepresented in this community. We need to acknowledge this to dodge the most critical errors, and only then look for more nuanced views on when to place exactly how much faith in the statements researchers make.” I hope I’m not misrepresenting your view here; this is a statement I used to believe sincerely. And I still think that science has great value, and that published research is the most accurate source of information out there. But I no longer believe that this “level 2 view”, extrapolating (always dangerous :P) from your naming scheme, is a productive viewpoint. I think the nuance that I would like to introduce is absolutely essential, and that conflating different fields of research, or even research questions within a field, under this umbrella does more harm than good. In other words, I would like to discuss the accuracy of modern science with the understanding that any given concern may apply to a smaller or larger degree to any particular paper, exactly in proportion to the hypothetical universe-separating ability of the data I introduced earlier. I’m not sure if I should spell that out in great detail every couple of sentences to communicate that I am not arguing against science wholesale, but rather comparing science-as-practiced with truth-finding-in-theory and looking for similarities and differences on a paper-by-paper basis.
Most critically, I think the image of ‘overshooting’ or ‘undershooting’ trust in papers in particular or science in general is damaging to the discussion. Evaluating the accuracy of inferences is a multi-faceted problem. In some sense, I feel like you are pointing out that if we are walking in a how-much-should-I-trust-science landscape, to a lot of people the message “it’s really not all it’s cracked up to be” would be moving further away from the ideal point. And I agree. But simultaneously, I do not know of a way to get close (not “help the average person get a bit closer”, but get really close) to the ideal point without diving into this nuance. I would really like to discuss in detail what methods we have for evaluating the hard work of scientists to the best of our ability. And if some of that, taken out of context, forms an argument in the arsenal of people determined to metaphorically shoot their own foot off, that is a tragedy, but I would still like to have the discussion.
As an example, in your quote block I love the first paragraph but think the other four are somewhere between irrelevant and misleading. Yes, this discussion will not be a panacea for the replication crisis, and yes, without prior experience comparing crackpots to good sources you may well go astray on many issues. Despite all that, I would still really like to discuss how to evaluate modern science. And personally I believe that we are collectively giving it more credit than it deserves, with the excess spread in complicated ways across individual claims, research topics, and entire fields of science.
I pulled out your positive statements and beliefs, and give my response to each below.
To me it seems like the target audience consists of scientists and science fans, most of whom already have a lot of faith in the accuracy of the scientific process.
This makes sense as a target audience, and explains why you’d place a different emphasis than I do in addressing them.
There are so many ways of using sophisticated arguments to get to a wrong conclusion, and only one way to accurately tally up the evidence, that it takes skill and dedication to get to the right answer consistently.
Agreed.
I think the nuance that I would like to introduce is absolutely essential, and that conflating different fields of research, or even research questions within a field, under this umbrella does more harm than good. In other words, I would like to discuss the accuracy of modern science with the understanding that any given concern may apply to a smaller or larger degree to any particular paper, exactly in proportion to the hypothetical universe-separating ability of the data I introduced earlier.
So it sounds like you agree with my point that the appropriate confidence level in particular findings and theories may vary widely across publication types and fields. You just want to make the general point that you can’t trust everything you read, with the background understanding that sometimes this is more important, and sometimes less. So our disagreement is, at least here, a matter of where we wish to place emphasis?
I feel like you are pointing out that if we are walking in a how-much-should-I-trust-science landscape, to a lot of people the message “it’s really not all it’s cracked up to be” would be moving further away from the ideal point. And I agree.
Agreed.
I would really like to discuss in detail what methods we have for evaluating the hard work of scientists to the best of our ability.
My interpretation of this is that you want to have a discussion focused on “what methods do we have for figuring out how accurate published scientific findings are?” without worrying about how this might get misinterpreted by somebody who already has too little trust in science. Is that right?
I would still really like to discuss how to evaluate modern science. And personally I believe that we are collectively giving it more credit than it deserves, with the excess spread in complicated ways across individual claims, research topics, and entire fields of science.
This seems like a really odd claim to me, depending crucially on who you mean by “we.” Again, if you mean science enthusiasts, then OK, that makes sense. If you mean, say, the first-world public, then I’d point out that two big persistent stories these days are the anti-vax movement and climate change skeptics. But my guess is that you mean the former, and I agree with you that I have had a fair number of frustrating discussions with people who think the responsible position is to swallow scientific studies in fields like nutrition and psychology hook, line, and sinker.
Even if we set aside my concerns about how an audience with low trust in science might interpret this stuff, I still think that my points stand. We should be careful about extrapolation, and read these studies with the same general skepticism we should be applying to other studies. That might seem “misleading and irrelevant” to you, but I really don’t understand why. They’re good basic reminders for any discussion of science.
I agree with your reading of my points 1, 2, 4, and 5, but think we are not seeing eye to eye on points 3 and 6. It also saddens me that you condensed the paragraph on how I would like to view the how-much-should-we-trust-science landscape down to its least important sentence (point 4), at least from my point of view.
As for point 3, I do not want to make a general point about the reliability of science at all. I want to discuss what tools we have to evaluate the accuracy of any particular paper or claim, so that we can have more appropriate confidence across the board. I think this is the most important discussion regardless of whether it increases or decreases general confidence. In my opinion, attempting to give a 0th-order summary by discussing the average change in confidence from this approach is doing more harm than good. The sentence “You just want to make the general point that you can’t trust everything you read, with the background understanding that sometimes this is more important, and sometimes less.” is exactly backwards from what I am trying to say.
For point 6, I think it might be very relevant to point out that I’m European, and anti-vax sentiment and global warming denialism really are not that popular around where I live. They are treated more as stereotypes of untrustworthiness than as seriously held beliefs, thankfully. But setting that aside, I think that most of the people influencing social policy and making important decisions are leaning heavily on science, and unfortunately particularly on the types of science I have the lowest confidence in. I was hoping to avoid going into great detail on this, but as a short summary: I think it is reasonable to be less concerned with the accuracy of papers that have low (societal) impact and more concerned with papers that have high impact. If you randomly sample a published paper on Google Scholar or wherever, I’ll happily agree that you are likely to find an accurate piece of research. But this is not an accurate representation of how people encounter scientific studies in reality. I see people break the fourth virtue all the way from coffeehouse discussions to national policy debates, which is effective precisely because the link between data and conclusion is murky, so a lot of policy proposals can be backed by some number of references. Over the past few years my attempts to be more even-handed have led me to strongly decrease my confidence in a large number of scientific studies, if only to account for the selection effect that these studies, and not others, were brought to my attention.
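To sketch the selection effect I mean in code (a toy model with invented numbers, not a claim about any actual literature): suppose many studies produce noisy estimates of mostly small effects, but only the flashiest estimates get quoted in policy debates.

```python
import random

# Toy attention filter: lots of studies, noisy estimates of mostly small
# effects, and only estimates above a threshold get cited in debates.
# All distributions and thresholds are invented for illustration.

random.seed(1)

studies = []
for _ in range(10_000):
    true_effect = random.gauss(0.1, 0.1)           # most real effects small
    estimate = true_effect + random.gauss(0, 0.2)  # noisy measurement
    studies.append((true_effect, estimate))

cited = [(t, e) for t, e in studies if e > 0.4]    # the attention threshold

avg_true_all = sum(t for t, _ in studies) / len(studies)
avg_est_cited = sum(e for _, e in cited) / len(cited)
avg_true_cited = sum(t for t, _ in cited) / len(cited)

print(f"average true effect, all studies:       {avg_true_all:.2f}")
print(f"average reported effect, cited studies: {avg_est_cited:.2f}")
print(f"average true effect, cited studies:     {avg_true_cited:.2f}")
```

The studies that reach you through the filter systematically overstate their own true effects, so discounting exactly those studies is the correction, even while a random sample from Google Scholar stays mostly trustworthy.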
Also I think psychology and nutrition are doing a lot better than they were a decade or two ago, which I consider a great sign. But that’s more of an aside than a real point.
This makes a lot of sense, actually. You’re focused on mechanisms that a good thinker could use to determine whether a particular scientific finding is true or not. I’m worried about the ways that the conversation around skepticism can and does go astray.
Perhaps I read some of the quotes from the papers uncharitably. Silberzahn asks “What if scientific results are highly contingent on subjective decisions at the analysis stage?” I interpreted this question, in conjunction with the paper’s conclusion, as pointing to a line of thinking that goes something like this:
1. What if scientific results are highly contingent on subjective decisions at the analysis stage?
2. Some scientific results are highly contingent on subjective decisions at the analysis stage.
3. What if ALL scientific results are highly contingent on subjective decisions at the analysis stage, across the board???!!!
But a more charitable version for the third step is:
3. This method helped us uncover one such case, and might help us uncover more. Also, it’s a reminder to avoid overconfidence in published research, especially in politically charged and important issues where good evidence is hard to come by.
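Here is how I’d make that charitable reading concrete for myself, with a toy dataset of my own invention (nothing to do with the actual Silberzahn red-card data): one fixed dataset, several analysis choices a reasonable analyst could defend, and a visible spread in the answers.

```python
import random
import statistics

# One fixed dataset, several defensible analysis choices, different answers:
# a cartoon of the many-analysts situation. The data are fabricated.

random.seed(2)

# Fake observational data: outcome depends weakly on exposure, strongly on
# age, and older people are more likely to be exposed (a confound).
rows = []
for _ in range(500):
    age = random.uniform(20, 60)
    exposed = random.random() < (age / 100)
    outcome = 0.2 * exposed + 0.05 * age + random.gauss(0, 1)
    rows.append((age, exposed, outcome))

def naive_diff(data):
    """Difference in mean outcome, exposed minus unexposed, no adjustment."""
    yes = [o for _, e, o in data if e]
    no = [o for _, e, o in data if not e]
    return statistics.mean(yes) - statistics.mean(no)

# Choices an honest analyst might defend:
analyses = {
    "all data, no adjustment": naive_diff(rows),
    "under-40s only": naive_diff([r for r in rows if r[0] < 40]),
    "40-and-over only": naive_diff([r for r in rows if r[0] >= 40]),
    "trim extreme outcomes": naive_diff(sorted(rows, key=lambda r: r[2])[25:-25]),
}
for name, est in analyses.items():
    print(f"{name:>24}: estimated effect {est:+.2f}")
```

Same data, four defensible choices, noticeably different numbers. The charitable reading treats the multi-analyst method as a tool for flagging such cases; the uncharitable one treats it as an indictment of everything.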
I spent the last ten years teaching children, and so my default mode is one of “educating the young and naive to be a little more sophisticated.” Part of my role was to sequence and present ideas with care in order to increase the chance that an impressionable and naive young mind absorbed the healthy version of an idea, rather than a damaging misinterpretation. Maybe that informs the way I perceive this debate.
Just wanted to confirm you have accurately described my thoughts, and I feel I have a better understanding of your position as well now.