Imagine you are a Good Scientist. You know about p-hacking and the replication crisis. You want to follow all best practices. You want to be doing Good Science!
You’re designing an experiment to detect if there’s a correlation between two variables.
Your Good Scientist has gone off the rails already. Why do they want to know if there’s a correlation between two variables? What use is a correlation?
I am not seeing where your Bayesian Scientist is doing any better. He’s dropped p-values and adopted a prior, but he’s still just looking for correlations and expressing results according to the Bayesian ritual instead of the Frequentist ritual. But nobody cares whether smokers tend to be taller or shorter than non-smokers. They care about whether smoking stunts growth. A Truly Good Scientist needs to be looking for causal structures and mechanisms.
Looking for causal structures and mechanisms does entail (among other things) measuring correlations. Would your criticism still be valid if he had used a different example? He could have chosen anything else; the example was only there to illustrate a point.
Exactly as mukashi was saying, the correlation is purely an example of something I want to find out about the world. The process of drawing inferences from correlations could be improved too, but that’s a different topic, and not really relevant for the central point of this post.
The point I’m raising is independent of the example. “Looking for a correlation” is never the beginning of an enquiry, and, pace mukashi, is not necessarily a part of the enquiry. What is this Scientist really wanting to study? What is the best way to study that?
I work with biologists who study plants, trying to work out how various things happen, such as the development of leaf shapes, or the development of the different organs of flowers, or the process of building cell walls out of cellulose fibrils. Whatever correlations they might measure from time to time are subordinate to questions of which genes are being expressed where, and how biological structures get assembled.
That may be the case, but I think that is peripheral to the point of this post. If for some reason I wanted to find out the value of a variable (and this variable could be anything, including a correlation), how would I go about doing it?
I am taking the point of the post to be as indicated in the title and the lead: creating a model for doing Empirical Science. Finding out the value of a variable — especially one with no physical existence, like a correlation between two other variables — is a very small part of science.