Well, of course I can’t give the right answer if the right answer depends on information you’ve just specified I don’t
have.
I think there is “the right answer” here, and I think it does not rely on observing the confounder. If your decision theory does, then (a) your decision theory isn’t as smart as it could be, and (b) you are needlessly restricting yourself to certain types of decision theories.
The appropriate reference class for deciding whether to give HAART to an HIV patient is not just the set of all HIV
patients who’ve been given HAART precisely because of the possibility of confounders.
People have been thinking about confounders for a long time (the earliest reference I know of to a “randomized” trial is the book of Daniel; see also this: http://ije.oxfordjournals.org/content/33/2/247.long). A lot of nice, clever math for getting around unobserved confounders has been developed in the last 100 years or so. Saying “well, we just need to observe the confounders” is sort of silly. That’s like saying “well, if you want to solve this tricky computational problem, forget about developing new algorithms and that whole computational complexity thing, and just buy more hardware.”
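To give one concrete flavor of that math (standard textbook material, not anything specific to the exchange above): the front-door formula identifies the effect of X on Y even when the confounder of X and Y is unobserved, provided there is a fully observed mediator M satisfying the usual front-door conditions:

```latex
% Front-door adjustment for X -> M -> Y with an unobserved confounder U of X and Y.
% Assumes: M intercepts all directed paths from X to Y, there is no unblocked
% back-door path from X to M, and X blocks all back-door paths from M to Y.
P(Y = y \mid do(X = x)) \;=\; \sum_{m} P(m \mid x) \sum_{x'} P(y \mid x', m)\, P(x')
```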
In actual problems people want to solve, people have the option of acquiring more information and working from
there.
I don’t know what kind of actual problems you work on, but the reality of life in stats, medicine, etc. is that you have your dataset and you have to draw conclusions from it. The dataset is crappy: there is probably selection bias, all sorts of missing data, censoring, things we would really like to have known but which were never collected, etc. This is just a fact of life for folks in the trenches in the empirical sciences/data analysis. The right answer here is not denial, but new methodology.
For non-experts in the thread, what’s the name of this area, and is there a particular introductory text you would recommend?
Thanks for your interest! The name of the area is “causal inference.” Keywords: “standardization” (in epidemiology), “confounder or covariate adjustment”, “propensity score”, “instrumental variables”, “back-door criterion”, “front-door criterion”, “g-formula”, “potential outcomes”, “ignorability”, “inverse probability weighting”, “mediation analysis”, “interference”, etc.
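For the non-experts, here is a minimal numerical sketch (mine, not taken from any of the texts below) of one of those keywords, inverse probability weighting, on simulated data. Note that it assumes the single confounder is observed, which is exactly the restriction the fancier machinery (front-door, instrumental variables, etc.) tries to relax:

```python
# Toy sketch: inverse probability weighting (IPW) with one *observed*
# binary confounder Z.  Treatment A depends on Z; outcome Y depends on
# A and Z.  The naive difference in means is confounded; IPW recovers
# the true average treatment effect of A, which is 1.0 by construction.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

Z = rng.binomial(1, 0.5, n)                     # confounder
pA = np.where(Z == 1, 0.8, 0.2)                 # propensity score P(A=1 | Z)
A = rng.binomial(1, pA)                         # treatment
Y = 1.0 * A + 2.0 * Z + rng.normal(0, 1, n)     # outcome; true effect of A is 1.0

# Naive (confounded) contrast: E[Y | A=1] - E[Y | A=0]
naive = Y[A == 1].mean() - Y[A == 0].mean()

# IPW contrast using the true propensity score
# (in practice you would estimate it, e.g. by logistic regression)
w = A / pA + (1 - A) / (1 - pA)
ipw = np.average(Y, weights=A * w) - np.average(Y, weights=(1 - A) * w)

print(f"naive: {naive:.2f}, IPW: {ipw:.2f}")    # naive is biased upward; IPW is close to 1.0
```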
Pearl’s Causality book (http://www.amazon.com/Causality-Reasoning-Inference-Judea-Pearl/dp/052189560X/ref=pd_sim_sbs_b_1) is a good overview (but doesn’t talk a lot about statistics/estimation). Early references are Sewall Wright’s path analysis paper from 1921 (http://naldc.nal.usda.gov/download/IND43966364/PDF) and Neyman’s paper on potential outcomes from 1923 (http://www.ics.uci.edu/~sternh/courses/265/neyman_statsci1990.pdf). People say either Sewall Wright or his father, Philip Wright, also invented instrumental variables.
Thanks