Awards for the Best Answers
When this question was posted a month ago, I liked it so much that I offered $100 of my own money for what I judged to be the best answer and another $50 to the best distillation. Here’s what I think:
Overall prize for best answer ($100): Unnamed
Additional prizes ($25): waveman, Bucky
I will reach out to these authors via DM to arrange payment.
No one attempted what seemed to me like a proper distillation of the other responses, so I won’t be awarding the distillation prize here; however, I intend to write and publish my own distillation/synthesis of the responses soon.
Some thoughts on each of the replies:
Unnamed [winner]: This answer felt very thorough and detailed, and it feels like a guide I could really follow to dramatically improve my ability to assess studies. I’m assuming limitations of LW’s current editor meant the formatting couldn’t be nicer, but I also really like how Unnamed broke down his overall response into three main questions (“Is this just noise?”, “Is there anything interesting going on here?” and “What is going on here?”) and then presented further sub-questions and examples to help one assess the high-level questions.
I’d like to summarize Unnamed’s response better here, but you should really just read it all.
waveman [winner]: waveman’s reply hits a solid amount of breadth in how to assess studies. His response feels like an easy guide I could pin up on my wall and step through while reading papers. What I would really like to see is this response fleshed out further with examples and resources, e.g. “read these specific papers or books on how studies get rigged.” I’ll note that I do have some pause with this response since other responders contradicted at least one part of it, e.g., Kristin Lindquist saying not to worry about the funding source of a study. I’d like to see these (perhaps only surface-level) disagreements resolved. Overall though, a really solid answer that deserves its karma.
Bucky [winner]: Bucky’s answer is deliciously technical. Rather than discussing high-level qualitative considerations to pay attention to (e.g. funding source, whether there have been replications), Bucky dives in and provides actual formulas and guidance about sample sizes, effect sizes, etc. What’s more, Bucky discusses how he applied this approach to concrete studies (80k’s replication quiz) and the outcome. I love the detail of the reply and that it’s backed up by concrete usage. I will mention that Bucky opens by saying that he uses subconscious thresholds in his assessments but is interested in discussing the levels other people use.
I do suspect that learning to apply the kinds of calculations Bucky points at is tricky and vulnerable to mistaken application. Probably a longer resource/more training is needed to be able to apply Bucky’s approach successfully, but his answer at the least sets one on the right path.
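To gesture at what those calculations look like, here’s a minimal sketch (my own illustration, not Bucky’s actual formulas) of a power check: given a study’s per-group sample size and a guessed true effect size (Cohen’s d, the standardized difference between group means), how likely was the study to detect the effect at all?

```python
# Minimal sketch of a power calculation, using a normal approximation
# to the two-sided, two-sample t-test.
from scipy.stats import norm

def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample test.

    d: assumed true effect size (Cohen's d) -- the hard part to estimate.
    n_per_group: participants in each arm.
    """
    z_crit = norm.ppf(1 - alpha / 2)              # critical value, two-sided
    noncentrality = d * (n_per_group / 2) ** 0.5  # shift of the test statistic
    return 1 - norm.cdf(z_crit - noncentrality)   # ignores the negligible other tail

# A study with 20 per group hunting a smallish effect (d = 0.3)
# has power of roughly 0.15.
print(f"{approx_power(0.3, 20):.2f}")
```

A study that is badly underpowered like this yet still reports a significant effect is a prime candidate for replication failure.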
Kristin Lindquist: Kristin’s answer is really quite solid but feels like it falls short of the leading responses in terms of depth and guidance, and it doesn’t add too much, though I do appreciate the links that were included. It’s a pretty good summary, and also one of the best formatted of all the answers given. I would like to see waveman and Kristin reach agreement on the question of looking at funding sources.
jimrandomh: Jim’s answer was short but added important points to the conversation that no one else had stated. I think his suggestion to ask yourself how you ended up reading a particular study is excellent and crucial. I’m also intrigued by his claim that controlling for confounds is much, much harder than people typically think. I’d very much like to see a longer essay demonstrating this.
Elizabeth: I feel like this answer solidly reminds me to think about core epistemological questions when reading a study, e.g., “how do they know this?”
Romeostevensit: this answer added a few more things to look for not included in other responses, e.g. giving more credence to authors who discuss what can’t be concluded from their study. I also like his mention that spurious effects can sneak in despite the honest intentions of moderately competent scientists; my experience with data analysis supports this. I’d like to see a discussion between Romeostevensit and jimrandomh since they both seem to have thoughts about confounds (and I further know they both have an interest in nutrition research).
Charlie Steiner: Good additional detail in this one, e.g. the instruction to compare papers to other similar papers and general encouragement to get a sense of what methods are reasonable. This is a good answer, just not as good as the very top answers. Would like to see some concrete examples to learn from with this one. I appreciate the clarification that this response is for Condensed Matter Physics. I’d be curious to see how other researchers feel it generalizes to their domains.
whales: Good advice and they could be right that a lot of key knowledge is tacit (in the oral tradition) and not included in papers or textbooks. That seems like something well worth remembering. I’d be rather keen to see whales’s course on layperson evaluation of science.
The Major: Response seems congruent with other answers but is much shorter and less detailed than them.
It would be good to know whether offering prizes like this helps produce counterfactually more and better responses. So, to all those who responded with the great answers, I have a question:
How did the offer of a prize influence your contribution? Did it make any difference? If so, how come?
Thanks Ruby.
Good summary of my answer; by the time I got round to writing mine there were so many good qualitative summaries I wanted to do something different. I think you’ve hit the nail on the head with the main weakness being difficulty in application, particularly in estimating Cohen’s d.
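(For reference, Cohen’s d is the standardized mean difference between two groups; a minimal sketch of backing it out from the summary statistics a paper reports:)

```python
# Minimal sketch: recover Cohen's d from reported group means and SDs.
# The hard part in practice is guessing plausible values *before* seeing
# the data, which is where estimation gets difficult.
from math import sqrt

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Standardized mean difference using the pooled standard deviation."""
    s_pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / s_pooled

print(round(cohens_d(10.5, 2.0, 30, 9.5, 2.0, 30), 2))  # -> 0.5, a "medium" effect
```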
I am currently taking part in replication markets and basing my judgements mainly on experimental power. Hopefully this will give me a better idea of what works and I may write an updated guide next year.
As a data point re: the prize, I’m pretty sure that if the prize hadn’t been there I would have done my usual thing of intending to write something and never actually getting round to it. I think this kind of prize is particularly useful for questions which take a while to work on and where attention would otherwise drift.
I’d be excited to see that.
Oh, that’s helpful to know, and it reminds me that I intended to ask respondents how the offer of a prize affected their contributions.