Meta: This seems like a 101-level question, so I ask it in the Open Thread rather than using the questions feature on LW.
Suppose you are designing an experiment for a blood pressure drug. I’ve heard that it is important to declare which metrics you are interested in up front, before collecting the data, not afterwards. I.e., you’d declare up front that you only care about measuring systolic blood pressure, diastolic blood pressure, HDL cholesterol, and LDL cholesterol. And then if you happen to see an effect on heart rate, you are supposed to ignore it.
Why is this? Surely just for social reasons, right? If we do the experiment and happen to see data on heart rate, The Way of Bayes says we are forced to update our beliefs. Why would we ignore those updated beliefs?
Maybe this is getting at the thing (I don’t know what it’s called) where, if you flip a coin 100 times every day for 10 years, some of those days are going to show extreme results, but that doesn’t mean the coin is weighted. Since you’re doing it so much, you’d expect it to happen. Or something like that. I don’t think my example is quite the same thing, since it’s the same coin. A better example is probably a blood test: if you test 1000 different metrics, a few of them are bound to give “statistically significant results” just by chance. But this just seems like a phenomenon that you need to adjust for, not a reason to ignore data.
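(For concreteness, here’s a minimal simulation of that blood-test example, with made-up numbers: 1000 metrics, no true effect on any of them, and a standard two-sample t-test on each. Roughly 5% of them come out “significant” at p < 0.05 anyway.)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# 1000 metrics, none of which is actually affected: for each metric,
# compare a "treated" and a "control" sample drawn from the same distribution.
n_metrics, n_per_group = 1000, 50
false_positives = 0
for _ in range(n_metrics):
    treated = rng.normal(0, 1, n_per_group)
    control = rng.normal(0, 1, n_per_group)
    _, p = stats.ttest_ind(treated, control)
    if p < 0.05:
        false_positives += 1

# With a 5% threshold we expect ~50 of the 1000 null metrics to come up
# "statistically significant" purely by chance. A Bonferroni-style
# adjustment (per-metric threshold of 0.05 / 1000) is one way to correct.
print(false_positives)
```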
FWIW, I think this is well above 101-level and gets into some pretty deep issues, and sparked some pretty acrimonious debates back during the Sequences when Eliezer blogged about stopping rules & frequentism. It’s related to the question “why Bayesians don’t need, in theory, to randomize”, which is something Andrew Gelman mentioned for years before I even began to understand what he was getting at.
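(A toy simulation of the stopping-rule issue, numbers arbitrary: compute a p-value after every 10 observations and stop as soon as it dips below 0.05. The null is true throughout, yet the realized false-positive rate lands far above the nominal 5%. That’s the frequentist half of the debate; the Bayesian likelihood for the same data is unchanged by the stopping rule.)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def peeking_trial(max_n=500, peek_every=10):
    """One experiment under the null, testing after every `peek_every`
    observations and stopping as soon as p < 0.05."""
    data = rng.normal(0, 1, max_n)
    for n in range(peek_every, max_n + 1, peek_every):
        _, p = stats.ttest_1samp(data[:n], 0)
        if p < 0.05:
            return True  # "significant" result found; stop and report
    return False

trials = 2000
rate = sum(peeking_trial() for _ in range(trials)) / trials
# Nominal alpha is 5%, but with ~50 looks at the data the realized
# false-positive rate is several times higher (roughly 25% in runs like this).
print(rate)
```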
Oh that stuff is good to know. Thanks for those clarifications. I actually don’t see how it’s related to randomization though, so add that to the list of things I’m confused about. My question feels like a question of what to do with the data you got, regardless of whether you utilized randomization in order to get that data.
It’s the same question because both come down to what screens off the data-generating process. A researcher who is biased or p-hacking or outcome-switching is like a world which generates imbalanced/confounded experimental vs ‘control’ groups, in that a Bayesian needs to model the data-generating process (like the stopping rule) to learn correctly from the data, while pre-registration and explicit randomization make the results independent of those, so a simple generative model is correct.
(So this is why you can get a decision-theoretic justification for Bayesians doing these things even if they are sure they are modeling all confounding etc. correctly: because it is a ‘nothing up my sleeve’-esque design which allows sharing information with other agents who have non-shared priors. By committing to randomization or pre-registration, they can simply take your data at face value and do an update, whereas if they had to model you as a non-randomized generating process producing arbitrarily biased data in unknown ways, the data would be uninformative and lose almost all of its possible value.)
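(A toy grid-approximation example of that, with hypothetical numbers: the same reported result, 16 successes out of 20 trials, supports a much weaker update once you model it as the best of three unreported attempts. Pre-registration is what licenses the simple face-value likelihood.)

```python
import numpy as np
from scipy import stats

# Hypothetical report: 16 successes out of 20 trials. Grid over the
# unknown success probability theta, flat prior.
theta = np.linspace(0.01, 0.99, 99)
n, reported = 20, 16

# (a) Face value: the report is one honest Binomial(20, theta) draw.
like_honest = stats.binom.pmf(reported, n, theta)

# (b) Modeled selection: the researcher secretly ran 3 experiments and
# reported only the best, so the report is the max of 3 draws:
# P(max = s) = F(s)^3 - F(s-1)^3, with F the Binomial CDF.
F = lambda s: stats.binom.cdf(s, n, theta)
like_selected = F(reported) ** 3 - F(reported - 1) ** 3

for like, label in [(like_honest, "face value"), (like_selected, "best of 3")]:
    post = like / like.sum()  # normalize on the grid
    print(label, "posterior mean:", round(float((theta * post).sum()), 3))
```

The “best of 3” posterior mean comes out noticeably lower: the identical reported number carries less evidence once the selection process is part of the model.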
Bayes has nothing to do with the concept of statistical significance. Statistical significance is a concept out of frequentist statistics, and one that comes with a lot of problems.
Nobody really argues that you should ignore it. If you wanted drug approval, you would likely even have to list it as a potential side effect; that’s why the increased lightning-strike risk of the Moderna vaccine was disclosed. It’s just that your study doesn’t provide good evidence for the effect existing. If you want that evidence, you can run another study to look for it.