I’m embarrassed I didn’t finish them earlier. I’ve been procrastinating on them for years because they’re the most annoying kind of results—a lot of work to write up and then the results are just inconclusive (and predictably so due to sample size & measurement problems).
I had a similar problem with my catnip surveys: I saw even while running the first one that the results were heavily biased by an accidental demand effect*, which meant the results wouldn’t be solid no matter how much I tried to salvage them in modeling, so I put it off for 3 years, and it made me wince whenever I thought about how it wasn’t finished yet. (In contrast, the delays in analyzing my ad A/B experiment, which showed apparent large harm to site traffic from modest advertising, were for the opposite reason—I was so excited about what could be a very important, novel, interesting effect that I kept holding off to try different tweaks to the analysis, to make sure there wasn’t something I’d missed or an artifact of my initial model.) That sort of problem kills my motivation to finish, especially since, as the joke goes, running the experiment takes 90% of the work and then writing up the experiment takes the other 90% of the work… And I still have a bunch of half-finished experiments to write up (the CO2/sleep/Mnemosyne experiment; the second magnesium experiment; the Soylent daily variance experiment; the Long Bets experiment; the HN link submission experiment and analysis; the Mailchimp email send-time A/B tests), never mind all the usual projects.
* because I advertised it as being about catnip, respondents selectively recalled catnip-responding cats; the bias was easy to see because the second cat reported was much less likely to be reported as a catnip-responder, even though there’s no reason to think the first cat people reported would respond at 90%+ while second-through-fifth cats respond at 60–70%. In the final analysis, if you add a covariate for ‘first cat reported’ vs later cats to estimate the bias and subtract it out, you get exactly the meta-analytic estimate of the catnip-responding rate (2/3rds). A toy sketch of that adjustment follows.
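To make the adjustment concrete, here is a minimal sketch in Python using simulated data with made-up numbers (not the actual survey responses): fit a logistic regression with a ‘first cat reported’ indicator as the covariate, and read off the predicted responder rate for non-first cats as the de-biased estimate.

```python
# Hypothetical illustration: estimating a demand-effect bias with a
# "first cat reported" covariate, then predicting the rate for later cats.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
true_rate = 2 / 3                    # meta-analytic catnip-response rate
n = 2000

first_cat = rng.integers(0, 2, n)    # 1 = this was the first cat a respondent reported
# Simulated demand effect: first-reported cats are selectively recalled as
# responders (~90%); later cats respond at roughly the true base rate.
p = np.where(first_cat == 1, 0.9, true_rate)
responds = rng.binomial(1, p)

# Logistic regression with the first-cat indicator as a bias covariate.
X = sm.add_constant(first_cat)
fit = sm.Logit(responds, X).fit(disp=0)

# Predicted probability for a non-first cat approximates the unbiased rate.
debiased = fit.predict([1, 0])[0]
print(f"raw rate = {responds.mean():.2f}, de-biased rate = {debiased:.2f}")  # ~0.67
```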