What is the strongest effect you ever found in this way?
I haven’t compiled my results into a table or anything, but IIRC the largest effect size so far was taking vitamin D at bedtime, with d ~= -0.7. (Roughly in line with psychology meta-analyses: effect sizes drop off sharply past |0.6|.)
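(For anyone unfamiliar with the notation: the d here is Cohen’s d, the difference in means scaled by a pooled standard deviation. A minimal sketch with made-up sleep-quality scores — the numbers are invented, not gwern’s data:)

```python
import statistics

def cohens_d(treatment, control):
    """Cohen's d: standardized mean difference with a pooled (sample) SD."""
    n1, n2 = len(treatment), len(control)
    v1, v2 = statistics.variance(treatment), statistics.variance(control)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd

# Hypothetical scores: vitamin-D-at-bedtime nights vs. control nights.
treated = [3, 4, 2, 3, 3, 2, 4, 3]
control = [5, 4, 5, 6, 4, 5, 5, 4]
print(round(cohens_d(treated, control), 2))  # negative: treated nights scored worse
```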
I don’t think this is a good way to think about confounding. For one thing, you are implicitly assuming the effect is monotonic. Perhaps this is true with nootropics (though how do you know?).
The background research and published experiments don’t seem to include unusual adjustments for non-monotonicity (not really sure what that means in this context).
Monotonicity is not true in general, though.
In general? Do you have a meta-analysis over hundreds of different kinds of experiments showing this?
Actually I am not even talking about the response to the treatment. Suppose you were a werewolf, and the outcome you were measuring was a physical test. Every few days out of 28, you would score off the charts completely independently of whatever physical-enhancement treatment you were taking, just because you were half-wolf on those days. So you might conclude there is an effect under the null. Now, werewolves do not exist, but are you sure this sort of thing doesn’t happen to you? How do you know?
Wouldn’t this be covered by randomization? If I randomize each day to this treatment, half of the wolf-days will be under treatment days and half under control days. They’ll inflate the standard deviation and I’ll be much less likely to reject the null.
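(This claim can be pictured with a toy simulation — all numbers invented, e.g. the 10-point “wolf-day” spike. Under the null, a treatment-independent periodic spike lands in both arms about equally, so the estimated effect stays centered on zero while the spread balloons:)

```python
import random
import statistics

random.seed(0)

def run_experiment(n_days=280, wolf_period=28, wolf_days=3):
    """Toy self-experiment under the null (the treatment truly does nothing).

    Treatment is randomized daily; on a few 'wolf' days per 28-day cycle the
    outcome spikes regardless of the assignment."""
    treated, control = [], []
    for day in range(n_days):
        take = random.random() < 0.5          # daily randomization
        y = random.gauss(0, 1)                # baseline noise; true effect = 0
        if day % wolf_period < wolf_days:     # wolf days hit both arms equally
            y += 10
        (treated if take else control).append(y)
    diff = statistics.mean(treated) - statistics.mean(control)
    pooled_sd = statistics.stdev(treated + control)
    return diff, pooled_sd

# Across replications the estimated effect stays near zero (no bias under the
# null), while the pooled SD is inflated far above the baseline noise of 1.
diffs = [run_experiment()[0] for _ in range(200)]
print(statistics.mean(diffs), run_experiment()[1])
```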
I think that’s a curious attitude for someone who is into self-experimentation (independently of whether the opaque objection can be made clear or not).
From the sound of it, you’re largely making the theoretician’s objection: “but there are a billion ways your simple design could go wrong! How can you do any experiments if you don’t understand in detail every underlying tool or theorem?” Well, yes, it’s true that neither I nor other experimenters can rule out becoming a werewolf on every 5th Tuesday, or setting up an experiment with completely wrong blocks or washouts, nor can we be sure that induction will continue to work tomorrow and that we will not be eaten by grues or bleens, but nevertheless...
(not really sure what that means in this context).
I am just saying that confounding could make your effect weaker (if there is cancellation of paths), or stronger (if there is some sort of interaction with the treatment), or weaker sometimes and stronger other times. You just don’t know. Confounding doesn’t just increase the variance of your effect estimate, it creates bias in the estimate. That is, if you add up some confounded bits to your estimate, you are adding up garbage.
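(The bias-versus-variance distinction here can be made concrete with a hedged toy example — hypothetical numbers, and a deliberately *non-randomized* design: a hidden “bad day” factor drives both dosing and the outcome, the true treatment effect is zero, and yet the naive difference in means converges to a nonzero value no matter how much data accumulates:)

```python
import random
import statistics

random.seed(1)

def naive_effect(n_days):
    """Non-randomized toy design: you dose when a hidden factor says 'bad day'."""
    treated, control = [], []
    for _ in range(n_days):
        bad_day = random.random() < 0.5                      # hidden confounder
        take = random.random() < (0.8 if bad_day else 0.2)   # dosing tracks it
        y = random.gauss(0, 1) - (2.0 if bad_day else 0.0)   # it also lowers outcome
        # True treatment effect is zero; any estimated difference is pure bias.
        (treated if take else control).append(y)
    return statistics.mean(treated) - statistics.mean(control)

# More data does not help: the estimate converges to roughly -1.2, not 0.
print(naive_effect(100_000))
```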
Wouldn’t this be covered by randomization?
No. The werewolf example is a clear case of the copies not being exchangeable. Different versions of you could react to (randomized!) treatment differently, and you won’t know how without more assumptions. For instance, if you were a woman, you would have a different hormonal composition due to the monthly cycle, etc. etc. etc.
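(One way to picture the non-exchangeability point, with invented numbers: suppose the treatment helps only on “wolf” days. Daily randomization still recovers an average effect, but that average is a blend that describes neither kind of day, and it won’t transfer to anyone with a different mix of days:)

```python
import random
import statistics

random.seed(2)

def average_effect(n_days=100_000, wolf_frac=0.25):
    """Randomized toy design where the treatment effect differs across 'versions'."""
    treated, control = [], []
    for _ in range(n_days):
        wolf = random.random() < wolf_frac
        take = random.random() < 0.5              # randomized treatment
        effect = 2.0 if wolf else 0.0             # treatment only works on wolf days
        y = random.gauss(0, 1) + (effect if take else 0.0)
        (treated if take else control).append(y)
    return statistics.mean(treated) - statistics.mean(control)

# The randomized estimate is ~0.5: neither the wolf-day effect (2.0) nor the
# ordinary-day effect (0.0), and wrong for a person with a different wolf_frac.
print(average_effect())
```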
From the sound of it, you’re largely making the theoretician’s objection: “but there are a billion ways your simple design could go wrong!”
Look, what I am saying is not very complicated. I am not asking you to become a mathematician. You are looking for causal effects. That’s great! It is not my goal to discourage you! Just report your assumptions. All of them. Say you assume monotonicity, exchangeability of copies, etc. If you don’t know what assumptions you need to make, maybe read up on them. Reporting assumptions is good science, right? It’s standard practice in the stats literature.
No, see. The burden of proof is not on me. If you make an assumption, the burden of proof that it holds (or at the very least the burden of reporting it) is on you. Causal mechanisms in general are not monotonic... Just report your assumptions. All of them. Say you assume monotonicity, exchangeability of copies, etc. If you don’t know what assumptions you need to make, maybe read up on them.
This is an example of what I mean by a wildly impractical theoretical approach. Have you ever seen an experiment in which every assumption is reported with a proof? No, because such a paper would not be an experiment but an exercise in pure mathematics or statistics. No one would ever get anything done if they tried to actually apply your suggestions: they would spend all their time reading up on various statistical frameworks and going ‘well, I guess I should specify this and that assumption, but wait, don’t I also assume independence of who’s the current Justice of the Supreme Court?’ etc.
But don’t just assume some random thing you came up with after reading some slice of the literature that happened to catch your fancy will give you the effect you want.
I hate to break it to you, but that’s pretty much how it works. People read a slice of the literature, apply simple common models, which yield reasonable answers, and only start delving into the foundations and examining closely the methods if someone makes a good case that a hidden assumption or a method’s limitation is important. This should not dismay you any more than a philosopher of science should be dismayed that scientists spend their days in the lab and he is only consulted to deal with borderline cases like Intelligent Design.
Reporting assumptions is standard practice. For example, in the causal inference literature the mantra is often “we assume SUTVA (the stable unit treatment value assumption) and conditional ignorability.” You can’t prove them all (in fact, many are untestable). Reporting is still a good idea (for sensitivity analysis, replication, arguing about their reasonableness, etc.).
That’s reporting some assumptions, and presumably ones that have earned being specifically singled out.
Exchangeability of copies and monotonicity are pretty important. People always report monotonicity (because you get identification when you could not before). But anyways, I shouldn’t be the one to have to tell you this.
Also, it’s not some, it’s all assumptions needed to get your answer from the data. Even if exchangeability holds for you, it might not hold for someone else who might want to try your design. If you don’t write down what you assume, how should they know if your design will carry over?
Anyways, this is just the Scruffy AI mistake all over again. Actually, it’s worse than that. The scientific attitude is to try to falsify, i.e. look for reasons your model might fail. You are assuming as a default that your model is reasonable, and not even leaving a paper trail.