A variety of methods all pointing in a similar direction (post-workshop surveys, frequency of use and mention in one-on-ones with participants in the years afterward, uptake in the broader rationalist/EA/longtermist communities, mention in posts in other places like FB and LW, iteration and variation in content put out by alumni, etc). Dan Keys could probably say more concrete things.
I guess I should’ve also stressed the difference between alumni thinking it’s proven useful (which is easy to determine) and it actually being useful in accomplishing stuff (which is the more interesting and harder to determine part).
None of the things you mention seem like fool-proof methods of determining the second case.
True, but also Bayes. Be careful not to cargo-cult skepticism; when you’re dealing with a set of hundreds of people all telling you that Thing X was useful for them personally (especially when Thing X came from a menu of things, and it’s being highlighted particularly), the most reasonable hypothesis is that it was useful.
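The Bayes point here can be made concrete with a toy odds-form update. The function and all the numbers below are illustrative assumptions of mine, not anything drawn from actual survey data: suppose each alumni report is only twice as likely under "actually useful" as under "merely felt useful", and correlated self-deception caps the effective number of independent reports at ten.

```python
# Toy odds-form Bayesian update: how a handful of independent-ish
# "it was useful to me" reports move a skeptical prior.
# All numbers here are illustrative assumptions, not real data.

def posterior(prior, likelihood_ratio, n_effective_reports):
    """Update a prior probability through n independent reports,
    each carrying the given likelihood ratio."""
    odds = prior / (1 - prior)                  # convert to odds
    odds *= likelihood_ratio ** n_effective_reports  # multiply in evidence
    return odds / (1 + odds)                    # back to probability

# A skeptical 20% prior, a weak likelihood ratio of 2 per report,
# and only 10 effective reports already push the posterior above 99%.
p = posterior(prior=0.2, likelihood_ratio=2, n_effective_reports=10)
print(round(p, 3))
```

The point of the sketch is just that even heavy discounting for self-deception and correlation leaves "it was useful" as the dominant hypothesis once the reports number in the hundreds.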
Like, yes, people self-deceive, and yes, people misdiagnose. But it would be genuinely silly and not particularly rational for someone in my shoes, being directly exposed to these reports, to weirdly privilege the “nuh-uh” hypothesis.
Seems reasonable for you, one degree removed, to maintain a little skepticism. But I think people … fetishize? … a certain kind of hard, sterilized data, instead of recognizing that that certain kind of hard, sterilized data is one path to confidence, among many. It is indeed important to be able to prove mathematically that something is a triangle, and you can build a much taller tower of knowledge on a foundation that sturdy, but you can also just look, and recognize triangles quite effectively.
TAPs are useful for humans, as literally dozens of studies have shown. They have also proven useful to CFAR’s alumni, who are also humans, and this is not a surprising result.
I’m not exactly claiming that it is or is not useful in improving outcomes, I’m just wondering if any “hard, sterilized” data exists.
Anyway...
Your comment does make me wonder some things. Take the following more as me exploring my state of mind on a lazy Sunday afternoon and less as a retort or rebuttal.
I’m not sure that we actually disagree... it’s hard to quantify the amount of skepticism we’re each talking about. You can’t say “you should be skeptical at level 3” and I can’t say “no, you should be skeptical at level 4.3”. All either of us knows from the conversation so far is that I’m less skeptical than you!
> Be careful not to cargo-cult skepticism
Yes, I think this is a dangerous failure mode...but I think it’s likely that the flipside is a more common failure out there in the real world.
> dealing with a set of hundreds of people … the most reasonable hypothesis is that it was useful.
Maybe? I’d have to think about that more. However, being the most reasonable doesn’t mean there are no other reasonable hypotheses. We should work hard to increase the “reasonableness delta” so that we can increase our confidence.
There are millions of otherwise reasonable and successful people (often people I respect) telling me that they talk to ghosts or that loading up on Vitamin Omega Delta Seventeen Power Plus is the key to perfect health.
Combine that with the fact that this intervention seems pretty susceptible to conflating “feels useful” with “is useful”, and you have the largest item raising my skepticism level. Whether that conflation is actually happening could be an interesting line of inquiry.
We just have to use other considerations, like “does this conflict with how I think the world works?”, “are these people generally (in)correct about subjects in this area?”, and “how likely is it that they tricked themselves?”, to weight the evidence of “these people say this is useful” and adjust our skepticism toward the appropriate level.
Millions of people tell me lots of things like “statins work” and “drunk driving increases the risk of bad things”. I’m less skeptical of those claims because of the same sorts of considerations mentioned in the previous paragraph.
To me, by far the strongest point reducing my skepticism is the studies backing up TAPs. In fact, that point seems so strong that I’m a little confused by how much of your comment is directed at “believing people” versus “TAP studies”.
I’m not interested enough in the subject to dive into the studies, but if I were, I’d really be looking into whatever delta exists between CFAR’s practices and the literature. (Besides trying to generate the hard, sterilized data, of course.)
All that being said, I’m still not sure if I’m more or less skeptical than you on the subject.
> this intervention seems pretty susceptible to not being able to distinguish between “feels useful” and “is useful”
I’m curious to hear more of your model here, even if all you have is something half-baked. Like, if you would be willing to ELI5 why this intervention seems susceptible in this way, or paint me a picture of someone thinking that it’s useful but being wrong.
> There’s millions of otherwise reasonable and successful people (often people I respect) telling me that they talk to ghosts or that loading up on Vitamin Omega Delta Seventeen Power Plus is the key to perfect health.
… I am surprised by this. Mostly, I’m surprised by you assessing those people as otherwise reasonable. I think I view people’s capacity for reason as less compartmentalized, or something, and would find myself suspicious of all of their other conclusions if they talked to ghosts or loaded up on VOD17P+. Like, this wouldn’t stop them from being right-for-the-wrong-reasons, but I just wouldn’t be able to call them reasonable.
I do note that while the set of CFAR participants is not stellar in some absolute sense, it contains a much higher base rate of healthy skepticism and epistemic diligence/hygiene than most groups. Like, CFAR participants on the whole are a self-selected “at least nominally cares about what’s actually true” group, and I think I weight their self-reports accordingly? I trust the CFAR participants somewhere in between my trust for [college juniors majoring in fields that require grounding and feedback loops] and [college professors teaching in such fields], as a rough attempt to calibrate.