Another important comment occurred to me—sorry it’s late.
During the very first minicamps (the current workshops are agreed to be better) we randomized admission of 15 applicants, with 17 controls. Our study was low-powered and effects on e.g. income would have needed to be very large for us to expect to detect them. Still, we ended up with non-negligible evidence of absence: income, happiness, and exercise did not visibly trend upward one year later. [...] The details will be available soon on our blog (including a much larger number of negative results). We’ll run another RCT soon, funding permitting.
This is really exciting. Seeing CFAR run an RCT was one of the things that made me feel like CFAR “gets it”: that it is committed to measuring its own impact and cares whether that impact is more than mere speculation (warning: lots of nuance missing from this sentence).
However, I’m a bit disappointed to see little in the way of CFAR explicitly reacting to this negative evidence. It seems to me to be stated (which is really good!) but then ignored (which could be bad!). What are CFAR’s plans in response to this RCT? If the plan is just to fund another/better RCT, what is the status of that funding and how high of a priority is it? What long-run effects on CFAR will RCTs/measurement have? Would there ever be a situation where CFAR would shut down / admit they aren’t an equally compelling donation opportunity, based on RCT or other evidence?
I think this conversation is a time when numerical hypotheses are helpful; I personally did not expect the CFAR minicamp to increase income, happiness, or exercise over the next year, but thought if there was a discernible effect it was more likely to be positive than negative. A year is a short time as far as income is concerned; happiness is very hard to adjust; a weekend motivational retreat is unlikely to be effective at altering exercise relative to other interventions. (I exercise more now than I did before, primarily thanks to Beeminder, which shows up a lot in CFAR circles and some on LW, and I think I started that more than a year after going to CFAR the first time.)
Now, if the CFAR staff had put high probability on having success on one of those three fronts, then I think that logic is worth discussing.
a weekend motivational retreat is unlikely to be effective at altering exercise
I agree about income and happiness, but I would expect CFAR to at least boost exercise, as it (a) doesn’t seem hard and (b) seems to be exactly the kind of thing CFAR is trying to do. I don’t know much about the specifics of the RCT with regard to statistical power, etc., however.
However, a lot of the questions in my previous comment weren’t aimed specifically at the current RCT, but at the bigger picture here. For example, if CFAR wasn’t putting high probability on having success with these three fronts, then why were they the dependent variables for the RCT? And what does CFAR put high probability of having success on? How do they plan on measuring that?
For example, if CFAR wasn’t putting high probability on having success with these three fronts, then why were they the dependent variables for the RCT?
We were not putting high probability on it—the RCT had few participants but a large number of questions, which we launched knowing full well that it was unlikely to tell us much and that most results would likely be negative (and that any results with e.g. p=.05 would probably be statistical flukes, given the number of comparisons), specifically so we could figure out which hypotheses to test more carefully later.
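To make the multiple-comparisons point concrete, here is a minimal sketch of how quickly false positives pile up at a p=.05 threshold. The thread does not say how many comparisons were run, so the counts below are purely hypothetical, and the tests are assumed to be independent with every null hypothesis true, which is a simplification.

```python
def familywise_error_rate(num_comparisons: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests,
    assuming every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** num_comparisons


def expected_false_positives(num_comparisons: int, alpha: float = 0.05) -> float:
    """Expected number of tests that cross the threshold by chance alone."""
    return num_comparisons * alpha


if __name__ == "__main__":
    # Hypothetical comparison counts; the real number is not stated in the thread.
    for m in (1, 10, 50):
        fwer = familywise_error_rate(m)
        flukes = expected_false_positives(m)
        print(f"{m:>3} comparisons: P(at least one fluke) = {fwer:.2f}, "
              f"expected flukes = {flukes:.1f}")
```

Under these assumptions, a handful of "significant" results from a many-question survey is roughly what chance alone would produce, which is why such results are better read as candidates for a more careful follow-up test than as findings.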
We’ll be continuing with small, not-bankruptingly-expensive tests this year. If a large targeted donation could be found, we could of course do more of this faster; if anyone’s interested they should talk to me. We’ll also be continuing to rapidly shift the curriculum as we get informal impressions/feedback from our workshops and from the continuing stream of new units that we try on volunteers, in response mostly to our intuitive impressions but also to more formal tests.
(The RCT is not an attempt to conform to an effective altruism ritual. If such a ritual were imposed on CFAR’s structure without thinking carefully about what we’re actually trying to do, it would probably do more harm than good to our mission, in the manner of Feynman’s “Cargo cult science”. The RCT is just one part of a much larger set of attempts to figure out how to create effective, clear-thinking do-gooding, and to avoid deluding ourselves while we do this.)
I’m looking forward to talking with you on Skype—thanks for signing up for a timeslot—this’ll probably be easier to discuss in person.
“if the CFAR staff had put high probability on having success on one of those three fronts, then I think that logic is worth discussing.”
It would seem somewhat strange for CFAR to test three variables they did not expect to increase...
Also, I do not think happiness is very hard to adjust. There is research showing that some simple things can improve happiness, and these have been tested in RCTs; e.g. meditation and gratitude lists had a measurable effect.