I don’t have a good enough understanding of Bayesian statistics to make the argument in full, but I do know that very smart people have tried to combine it with causal models and concluded that it doesn’t work.
Do you know of any references on the problems people have run into? I’ve used Bayesian inference on causal models in my own day-to-day work quite a bit without running into any fundamental issues (other than computational difficulty), and what I’ve read of people using them in neuroscience and ML generally seems to match that. So it sounds like knowledge has failed to diffuse—either the folks using this stuff haven’t heard about some class of problems with it, or the causal modelling folks are insufficiently steeped in Bayesian inference to handle the tricky bits.
I don’t have a great reference for this.
A place to start might be Judea Pearl’s essay “Why I am only a half-Bayesian” at https://ftp.cs.ucla.edu/pub/stat_ser/r284-reprint.pdf . If you look at his Twitter account at @yudapearl, you will also see numerous tweets where he refers to Bayes’ theorem as a “trivial identity” and where he talks about Bayesian statistics as “spraying priors on everything”. See for example https://twitter.com/yudapearl/status/1143118757126000640 and his discussions with Frank Harrell.
Another good read may be Robins, Hernan and Wasserman’s letter to the editor at Biometrics, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4667748/ . While that letter is not about graphical models, the propensity scores and marginal structural models it discusses are mathematically very closely related. The main argument in that letter (which was originally a blog post) has been discussed on Less Wrong before; I am still trying to find that discussion, but it may be this link: https://www.lesswrong.com/posts/xdh5FPMYYGGX7PBKj/the-trouble-with-bayes-draft
From my perspective, as someone who is not well trained in Bayesian methods and does not pretend to understand the issue well, I just observe that methodological work on causal models very rarely uses Bayesian statistics, that I myself do not see an obvious way to integrate it, and that most of the smart people working on causal inference appear to be skeptical of such attempts.
Ok, after reading these, it’s sounding a lot more like the main problem is causal inference people not being very steeped in Bayesian inference. Robins, Hernan and Wasserman’s argument is based on a mistake that took all of ten minutes to spot: they show that a particular quantity is independent of the propensity score function if the true parameters of the model are known, and then jump to the conclusion that the estimate of that quantity is also independent of the propensity score, when in fact the estimate does depend on it, because the estimates of the model parameters depend on the propensity. Pearl’s argument is more abstract and IMO stronger, but it rests on the idea that causal relationships are not statistically testable, when in fact that’s basically the bread-and-butter use-case for Bayesian model comparison.
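To make that narrow point concrete, here’s a minimal toy sketch (my own construction, not an example from their letter; the linear outcome model, conjugate normal prior, and known noise variance are just simplifying choices): a conjugate Bayesian regression of Y on treatment A and covariate X, where the target causal quantity is the treatment coefficient. Running the identical analysis on data generated under two different propensity score functions gives different posteriors for that coefficient.

```python
# Toy sketch (not from Robins, Hernan and Wasserman's letter): conjugate Bayesian
# linear regression Y = th0 + th1*A + th2*X + noise, target quantity th1.
# The posterior over th1 depends on which units happened to be treated, and hence
# on the propensity score function used to assign treatment.
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, propensity, theta=(1.0, 2.0, 1.5), sigma=1.0):
    """Simulate (X, A, Y) with treatment A assigned with probability propensity(X)."""
    X = rng.normal(size=n)
    A = rng.binomial(1, propensity(X))
    th0, th1, th2 = theta
    Y = th0 + th1 * A + th2 * X + rng.normal(scale=sigma, size=n)
    return X, A, Y

def posterior_effect(X, A, Y, sigma=1.0, prior_var=10.0):
    """Closed-form posterior mean/sd of th1 under a N(0, prior_var*I) prior, known sigma."""
    Phi = np.column_stack([np.ones_like(X), A, X])              # design matrix
    Sn = np.linalg.inv(np.eye(3) / prior_var + Phi.T @ Phi / sigma**2)
    mn = Sn @ (Phi.T @ Y / sigma**2)                            # prior mean is zero
    return mn[1], np.sqrt(Sn[1, 1])

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for name, prop in [("flat pi(x) = 0.5", lambda x: np.full_like(x, 0.5)),
                   ("steep pi(x) = sigmoid(3x)", lambda x: sigmoid(3 * x))]:
    X, A, Y = simulate(2000, prop)
    mean, sd = posterior_effect(X, A, Y)
    print(f"{name}:  posterior for th1 = {mean:.3f} +/- {sd:.3f}")
```

The posterior mean and especially the posterior standard deviation of th1 come out different in the two runs, which is the (narrow) sense in which the estimate depends on the propensity score; this toy obviously doesn’t capture the high-dimensional setting Robins, Hernan and Wasserman actually care about, which I’ll address in the post.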
Some time in the next week I’ll write up a post with a few full examples (including the one from Robins, Hernan and Wasserman), and explain in a bit more detail.
(Side note: I suspect that the reason smart people have had so much trouble here is that the previous generation was mostly introduced to Bayesian statistics by Savage or Gelman; I expect someone who started with Jaynes would have a lot less trouble here, but his main textbook is relatively recent.)
I look forward to reading it. To be honest, knowing these authors, I’d be surprised if you’ve found an error that breaks their argument.
We are now discussing questions that are so far outside of my expertise that I do not have the ability to independently evaluate the arguments, so I am unlikely to contribute further to this particular subthread (i.e. to the discussion about whether there exists an obvious and superior Bayesian solution to the problem I am trying to solve).