I suppose of late I’ve updated toward thinking that pre-paradigmatic research is the very sort where you’re trying to come up with a metric to validate, not the type of research where you start with one. Note that if a project claims to be doing this kind of work, you need far more evidence to determine that the project is doing useful things, and your prior will be very low.
In this particular case, I expect that if CFAR did put in the work to measure e.g. Big Five changes, even if they found a notable effect size they’d say “Well, this doesn’t confirm for me that we’re on the right track, because this isn’t at all what I’m trying to measure,” or something.
Added: Though I too would be really interested to know whether CFAR workshops lead to any substantial changes in Raven’s matrices, and to a lesser-but-still-large extent any changes in the Big 5.
… pre-paradigmatic research is the very sort where you’re trying to come up with a metric to validate, not the type of research where you start with one.
This makes sense, though even if you’re doing the pre-paradigmatic thing, it seems useful & low-cost to benchmark your performance on existing metrics.
In this specific case, I bet workshop participants would actually find it fun + worthwhile to take before/after Big 5 & Raven’s surveys, so it could be a value-add in addition to a benchmarking metric.
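For concreteness, here is a minimal sketch (in Python, with made-up numbers) of what the benchmarking side could look like: given paired pre/post scores on some trait, the mean change, a paired t-test, and a paired-samples Cohen’s d are only a few lines. The scores below are hypothetical, not from any actual CFAR survey.

```python
import numpy as np
from scipy import stats

# Hypothetical paired pre/post scores for the same participants on one
# Big Five trait (e.g. Likert-averaged openness); real data would come
# from the before/after workshop surveys.
pre = np.array([3.2, 3.8, 2.9, 4.1, 3.5, 3.0, 3.7, 4.0])
post = np.array([3.5, 3.9, 3.1, 4.3, 3.6, 3.4, 3.8, 4.2])

diff = post - pre

# Paired t-test: is the mean pre-to-post change distinguishable from zero?
t_stat, p_value = stats.ttest_rel(post, pre)

# Paired-samples Cohen's d: mean change divided by the SD of the changes.
cohens_d = diff.mean() / diff.std(ddof=1)

print(f"mean change = {diff.mean():.2f}, d = {cohens_d:.2f}, p = {p_value:.3f}")
```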
Note that workshop participants already answer a fair number of questions beforehand (and a year later) to give a sense of how they progress, which I think ties in more closely with what the program is supposed to teach.
(My recollection was that the survey approximately maxed out the amount of time/attention I was willing to spend on surveys, although I’m not sure)
Huh. I’m surprised that after finding significant changes on well-validated psychological instruments in the 2015 study, CFAR didn’t incorporate these instruments into their pre- / post-workshop assessments.
Also surprised that they dropped them from the 2017 impact analysis.
The 2017 impact analysis seems to be EA safety focused. When their theory of impact is about EA safety, it’s plausible to me that this made analysis by standard metrics less important for them.
Do you mean “AI safety focused”?