Mini-review: The Book of Why
Someone should probably write a real book review, but to make a brief recommendation: The Book of Why by Judea Pearl and Dana Mackenzie is probably the most interesting general-science book I’ve read since Thinking, Fast and Slow.
Pearl’s goal is to explain and promote causal inference, which you might think of as (allegedly) the next big thing after frequentist and Bayesian statistics. The introduction is probably skippable, since the authors make some rather grand claims that aren’t backed up until later. I found myself thinking, “okay, maybe it’s great, but explain what it is already”.
Chapter 1 introduces the Ladder of Causation, the authors’ way of distinguishing the correlations found via a model-free statistical summary of data (which is level 1) from deductions that require a causal model (levels 2 and 3).
Chapters 2 and 3 give a partial, “whiggish” history of statistics from a causal perspective, covering frequentist and Bayesian statistics as well as Pearl’s own AI work, in which he invented Bayesian networks. At the end, he describes the three possible junctions in a Bayesian network: the chain, the fork, and the collider, and how each can easily cause confusion.
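The three junctions are easy to see in simulation. Here’s a minimal sketch (not from the book; the variable names and coefficients are my own illustrative choices) that generates data from each junction and shows how conditioning, approximated here by slicing the data to a narrow band of the middle variable, changes the correlation between the endpoints:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def corr(a, b):
    return float(np.corrcoef(a, b)[0, 1])

def noise():
    return rng.normal(size=n)

# Chain: X -> M -> Y.  X and Y are correlated, but conditioning on M
# removes the correlation.
x = noise(); m = x + noise(); y = m + noise()
band = np.abs(m) < 0.1
chain_marginal, chain_given_m = corr(x, y), corr(x[band], y[band])

# Fork: X <- Z -> Y.  Same statistical signature as the chain:
# correlated marginally, independent given the common cause Z.
z = noise(); x = z + noise(); y = z + noise()
band = np.abs(z) < 0.1
fork_marginal, fork_given_z = corr(x, y), corr(x[band], y[band])

# Collider: X -> Z <- Y.  X and Y start out independent, and
# conditioning on Z *creates* a spurious (negative) correlation.
x = noise(); y = noise(); z = x + y + noise()
band = np.abs(z) < 0.1
collider_marginal, collider_given_z = corr(x, y), corr(x[band], y[band])

print(f"chain:    {chain_marginal:+.2f} -> {chain_given_m:+.2f}")
print(f"fork:     {fork_marginal:+.2f} -> {fork_given_z:+.2f}")
print(f"collider: {collider_marginal:+.2f} -> {collider_given_z:+.2f}")
```

The confusion the book warns about is visible in the numbers: a chain and a fork look identical from the data alone, and conditioning on a collider manufactures a correlation that wasn’t there.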
Chapter 4 uses causal reasoning to explain the logic behind randomized controlled trials and other ways of controlling for confounding variables.
Chapter 5 covers the scientific debate over cigarette smoking, and how lack of clarity about causation resulted in this debate taking years longer than it needed to.
Chapter 6 is a fun chapter showing how to use causal diagrams to shed new light on the Monty Hall problem and Simpson’s paradox.
And that’s as far as I’ve read, but it’s enough to make a strong recommendation.
I did a quick search on Less Wrong and causality has been covered before, though not as clearly. In particular, see Yudkowsky’s Causal Diagrams and Causal Models.
(I was confused about one bit, though: Yudkowsky writes that “Causal models (with specific probabilities attached) are sometimes known as ‘Bayesian networks’ or ‘Bayes nets’.” But in the book, the authors make a clear distinction: “Unlike the causal diagrams we will deal with throughout the book, a Bayesian network carries no assumption that the arrow has any causal meaning.” Though later, they write, “These three junctions [...] are like keyholes through the door that separates the first and second levels of the Ladder of Causation.”)
Here’s a nice introduction to causal inference in a machine learning context:
ML beyond Curve Fitting: An Intro to Causal Inference and do-Calculus
Formally, a Bayes net simply defines a probability distribution in a way that makes probabilistic inference easier. However, a common way to build a Bayes net is to only include an edge from X to Y if X causes Y, in which case the Bayes net you build can be interpreted as a causal model.
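To make that distinction concrete, here’s a minimal sketch of a three-node Bayes net (the rain/sprinkler/wet-grass structure is a standard textbook example; the probability numbers are made up for illustration). The net just factorizes a joint distribution, which is all you need for probabilistic inference; reading the arrows as causal is a separate assumption on top of that:

```python
# CPTs for the net Rain -> Sprinkler, Rain -> Wet, Sprinkler -> Wet.
# Each function returns the probability of its first argument given
# the rest; the numbers are illustrative, not from any real data.
def p_rain(r):
    return 0.2 if r else 0.8

def p_sprinkler(s, r):
    p = 0.01 if r else 0.4  # sprinkler rarely runs when it rains
    return p if s else 1 - p

def p_wet(w, r, s):
    p = {(True, True): 0.99, (True, False): 0.9,
         (False, True): 0.9, (False, False): 0.01}[(r, s)]
    return p if w else 1 - p

def joint(r, s, w):
    # The Bayes net *is* this factorization of the joint distribution.
    return p_rain(r) * p_sprinkler(s, r) * p_wet(w, r, s)

# Probabilistic inference by enumeration: P(rain | grass is wet).
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
p_rain_given_wet = num / den
print(f"P(rain | wet) = {p_rain_given_wet:.3f}")
```

Nothing in this computation cares which way the arrows point; any factorization of the same joint distribution gives the same answer. The causal reading only matters once you start asking level-2 questions, like what happens if you *force* the sprinkler on.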
Here’s an earlier paper by Judea Pearl:
Bayesianism and Causality, or, Why I am Only a Half-Bayesian
http://ftp.cs.ucla.edu/pub/stat_ser/r284-reprint.pdf
Does the book make any comments about the kind of data you need to be working with?
I have been toying with the notion of looking at a few sets of historical data, and trying to use causal reasoning to establish qualitative causality, even if quantitative is too much to ask.
I’m not sure this will help in your case, but the usual framework for using causality for calculations seems to be that you have a DAG representing the causal connections between variables (without probabilities), plus statistical data. From these, some quantities can be computed that couldn’t be inferred from the statistical data alone.
The causal graph usually can’t be inferred from the data. However, some statistical tests can disprove a proposed causal graph: for example, the graph might imply that certain variables are statistically independent, and the data can contradict that.
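As a sketch of that falsification idea (my own example, not from the thread): the chain X → Z → Y implies X and Y are independent given Z, so for linear-Gaussian data a clearly nonzero partial correlation of X and Y controlling for Z rules the chain out.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing each on z."""
    rx = x - np.polyval(np.polyfit(z, x, 1), z)
    ry = y - np.polyval(np.polyfit(z, y, 1), z)
    return float(np.corrcoef(rx, ry)[0, 1])

# World A actually follows the chain X -> Z -> Y.
x = rng.normal(size=n)
z = x + rng.normal(size=n)
y = z + rng.normal(size=n)
pc_chain = partial_corr(x, y, z)   # near 0, as the chain graph predicts

# World B has an extra direct edge X -> Y, so the chain graph is wrong.
x = rng.normal(size=n)
z = x + rng.normal(size=n)
y = z + x + rng.normal(size=n)
pc_direct = partial_corr(x, y, z)  # clearly nonzero: chain falsified

print(f"chain world: {pc_chain:+.3f}, direct-edge world: {pc_direct:+.3f}")
```

Note the asymmetry: passing the test doesn’t confirm the chain (a fork Z-as-common-cause passes it too), but failing it does disprove it, which is the sense in which data can reject, though not select, a causal graph.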