A remarkable empirical finding across many scientific fields, at many different scales and levels of abstraction, is that a small set of control variables usually suffices.
I’m skeptical that this is true for most things we care about. It’s true in the scientific fields where we have the most accurate models, such as physics, but that’s likely because there are so few relevant variables in those fields.
Most new drugs that go into clinical trials fail. Essentially, a pharmaceutical company identifies a variable that appears to be the mediator of a medical outcome, they create a drug that tweaks that variable, and then it turns out not to produce the outcome that they thought it would. There are too many other relevant variables that are poorly understood.
The other thing that makes me skeptical is the effectiveness of machine learning models that use a large number of inputs. It’s possible that there’s a simple underlying structure to what they’re predicting that we just haven’t figured out yet, but based on what exists now, it sure looks like there are a large number of relevant variables.
Most new drugs that go into clinical trials fail. Essentially, a pharmaceutical company identifies a variable that appears to be the mediator of a medical outcome, they create a drug that tweaks that variable, and then it turns out not to produce the outcome that they thought it would. There are too many other relevant variables that are poorly understood.
I love this example in particular, because as I understand it, this is exactly what pharma companies do not do. What they actually do is target some variable which is correlated with the medical outcome, but is often not causal and is rarely a mediator.
Case in point: amyloid beta plaques in Alzheimer’s.
Decades ago, people noticed that if you look at the brains of old people with dementia, they usually have lots of plaques, and these plaques are made of a particular protein fragment called amyloid beta. Therefore clearly amyloid beta causes dementia. Pretty soon people were using amyloid beta plaques to diagnose dementia, which made it really easy to show that the plaques cause dementia: when the plaques are how we diagnose “dementia”, then by golly removing the plaques makes the “dementia” (as diagnosed by plaques) go away.
As far as I can tell, there has never at any point in time been compelling evidence that amyloid beta plaques cause age-related memory problems. Conversely, I have seen at least a few studies suggesting the plaques are not causal.
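To make the correlated-but-not-causal distinction concrete, here is a toy simulation. It is a sketch under made-up assumptions, not a model of actual Alzheimer’s biology: a hidden common cause (called `damage` here, a purely hypothetical variable) drives both plaques and memory loss, and plaques have zero causal effect on memory loss. All names and noise levels are invented for illustration.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n=20000, clear_plaques=False, seed=0):
    """Toy world: `damage` causes both plaques and memory loss.

    Plaques do NOT feed into memory loss (an assumption of this toy model).
    With clear_plaques=True we intervene and force plaques to zero,
    standing in for a drug that removes them.
    """
    rng = random.Random(seed)
    plaques, memory_loss = [], []
    for _ in range(n):
        damage = rng.gauss(0, 1)            # hidden common cause
        p = damage + rng.gauss(0, 0.5)      # plaques track the damage
        if clear_plaques:
            p = 0.0                         # intervention on plaques only
        m = damage + rng.gauss(0, 0.5)      # memory loss depends on damage only
        plaques.append(p)
        memory_loss.append(m)
    return plaques, memory_loss

if __name__ == "__main__":
    obs_plaques, obs_memory = simulate()
    _, treated_memory = simulate(clear_plaques=True)
    print("observational correlation:", pearson(obs_plaques, obs_memory))
    print("mean memory loss, untreated:", sum(obs_memory) / len(obs_memory))
    print("mean memory loss, plaques cleared:", sum(treated_memory) / len(treated_memory))
```

In this toy world the plaques are strongly correlated with memory loss in observational data, yet clearing them changes nothing about the outcome, because the intervention never touches the variable doing the causal work.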
Meanwhile, according to Wikipedia, 244 Alzheimer’s drugs were tested in clinical trials from 2002 to 2012, mostly targeting the amyloid plaques. Of those, only 1 drug made it through.
I think someone familiar with both causality/mediation and the Alzheimer’s literature could probably have told you in 2000 that those trials were unlikely to pass. But it turns out correct reasoning about causality/mediation is remarkably rare; remember that Pearl & co’s work is still very recent by academic standards, and most people in the sciences still don’t know about it. Pharma execs don’t have the technical skills for it. Some scientists do this sort of reasoning intuitively, but saying “no” to lots of stupid drug tests is not the sort of thing which makes one a “team player” at a big pharma company. (And besides, if the problem is hard enough, you can probably get more drugs to market by throwing lots of shit at the wall and hoping one passes by random chance; I wouldn’t put my money on that one drug which passed out of 244 actually being very effective.)
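The "passes by random chance" point can be sketched with a bit of arithmetic. This assumes, purely for illustration, that every candidate is ineffective, that candidates are independent, and that each has some small probability of clearing the whole trial process by luck. The per-drug rate below is a hypothetical number, not an estimate of any real approval process, which involves multiple phases and trials.

```python
def p_at_least_one_pass(n_drugs, false_pass_rate):
    """Chance that at least one of n_drugs ineffective drugs passes by luck.

    Assumes independent candidates that all share the same
    false_pass_rate (a hypothetical per-drug probability).
    """
    return 1.0 - (1.0 - false_pass_rate) ** n_drugs

def expected_lucky_passes(n_drugs, false_pass_rate):
    """Expected number of ineffective drugs that pass under the same assumptions."""
    return n_drugs * false_pass_rate

if __name__ == "__main__":
    n = 244          # trial count quoted above
    rate = 0.005     # hypothetical per-drug chance of a lucky pass
    print("P(at least one lucky pass):", p_at_least_one_pass(n, rate))
    print("expected lucky passes:", expected_lucky_passes(n, rate))
```

Under those assumptions, even a half-percent chance of a lucky pass per drug makes one success out of 244 about what you would expect from noise alone, so a single approval is weak evidence that the approved drug works.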