FWIW, I think this is well above 101-level and gets into some pretty deep issues, and sparked some pretty acrimonious debates back during the Sequences when Eliezer blogged about stopping rules & frequentism. It’s related to the question “why Bayesians don’t need, in theory, to randomize”, which is something Andrew Gelman mentioned for years before I even began to understand what he was getting at.
Oh that stuff is good to know. Thanks for those clarifications. I actually don’t see how it’s related to randomization though, so add that to the list of things I’m confused about. My question feels like a question of what to do with the data you got, regardless of whether you utilized randomization in order to get that data.
It’s the same question because both come down to screening off the data-generating process. A researcher who is biased, or p-hacking, or outcome-switching is like a world which generates imbalanced/confounded ‘experimental’ vs ‘control’ groups: a data-generating process which a Bayesian needs to model (just like the stopping rule) in order to learn correctly from the data, while pre-registration and explicit randomization make the results independent of all that, so a simple generative model is correct.
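To make that concrete, here’s a minimal sketch (toy numbers and a toy selection rule I made up, nothing from any real study): the true effect is zero, but a ‘study’ only gets reported when its sample mean clears a threshold. A Bayesian who updates on the reported studies as if they came from a clean pre-registered design concludes there is a solid effect; one who writes the selection step into the likelihood does not.

```python
# Toy "publication bias" world: the true effect is zero, but a study is only
# reported if its sample mean clears a threshold. Updating on the reported
# studies as if they were pre-registered gives a confidently wrong answer;
# modeling the selection step (a truncated likelihood) does not.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

true_effect = 0.0                # the intervention does nothing
n_per_study, sigma = 30, 1.0     # per-study sample size, known noise SD
se = sigma / np.sqrt(n_per_study)
threshold = 0.2                  # a study is reported only if its mean exceeds this

means = rng.normal(true_effect, se, size=20_000)   # all studies actually run
published = means[means > threshold]               # ...but these are all you see

theta = np.linspace(-1, 1, 2001)                   # effect-size grid, flat prior

# Naive model: treat each published mean as if it came from an unselected study.
naive_lp = stats.norm.logpdf(published[:, None], loc=theta, scale=se).sum(axis=0)

# Selection-aware model: the likelihood of a *published* mean is the truncated
# density, i.e. divide by P(mean > threshold | theta) for every study.
aware_lp = naive_lp - len(published) * stats.norm.logsf(threshold, loc=theta, scale=se)

def posterior_mean(log_post):
    w = np.exp(log_post - log_post.max())
    return np.sum(theta * w) / np.sum(w)

print(f"true effect:               {true_effect:+.3f}")
print(f"naive posterior mean:      {posterior_mean(naive_lp):+.3f}")   # well above zero: a 'solid' effect that isn't there
print(f"selection-aware post mean: {posterior_mean(aware_lp):+.3f}")   # lands close to the true zero
```

The only difference between the two updates is that `logsf` term, i.e. whether the reporting filter is part of your generative model or not.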
(So this is why you can get a decision-theoretic justification for Bayesians doing those things even if they are sure they are modeling all the confounding etc. correctly: randomization and pre-registration are a ‘nothing up my sleeve’-style design which allows sharing information with other agents who have non-shared priors. Because you committed to the randomization or the pre-registered analysis, they can simply take your data at face value and do a simple update; if they instead had to model you as a non-randomized generating process producing arbitrarily biased data in unknown ways, the data would be nearly uninformative and lose almost all of its possible value.)
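And for the stopping-rule half of that old debate, the standard result is easy to check numerically (made-up numbers again): seeing 7 heads in 20 flips gives the same posterior whether the experimenter fixed n=20 in advance or kept flipping until the 7th head, because the two likelihoods differ only by a θ-free constant that normalization cancels.

```python
# Stopping-rule check: 7 heads in 20 flips yields the same posterior under a
# fixed-n design (binomial likelihood) and a flip-until-the-7th-head design
# (negative-binomial likelihood), since they are proportional in theta.
import numpy as np
from scipy import stats

heads, flips = 7, 20
theta = np.linspace(0.001, 0.999, 999)   # coin-bias grid
prior = np.ones_like(theta)              # flat prior

lik_fixed_n  = stats.binom.pmf(heads, flips, theta)           # "flip exactly 20 times"
lik_stop_at7 = stats.nbinom.pmf(flips - heads, heads, theta)  # "flip until the 7th head"

post_fixed = prior * lik_fixed_n
post_fixed /= post_fixed.sum()
post_stopped = prior * lik_stop_at7
post_stopped /= post_stopped.sum()

print(np.allclose(post_fixed, post_stopped))   # True: the stopping rule drops out
```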