Reading Math: Pearl, Causal Bayes Nets, and Functional Causal Models
Hi all,
I just started a doctoral program in psychology, and my research interest concerns causal reasoning. Since Pearl’s Causality, the popularity of causal Bayes nets (CBNs) as psychological models of causal reasoning has really grown. Initially I had some serious reservations, but I’m beginning to think that many of these stem from the oversimplified treatment that CBNs get in the psychology literature. For instance, the distinction between a) directed acyclic graphs plus underlying conditional probabilities, and b) functional causal models, is rarely mentioned. Ignoring this distinction leads to some weird results, especially when the causal system in question has prominent physical mechanisms.
Say we represent Gear A as causing Gear B to turn because Gear A is hooked up to an engine, and because the two gears are connected to each other by a chain. Something like this:
Engine(ON) → GearA(turn) → GearB(turn)
As a causal Bayes net, this is problematic. If I “intervene” on GearA (perform do(GearA=stop)), then I get the expected result: GearA stops, GearB stops, and the engine keeps running (the ‘undoing’ effect [Sloman, 2005]). But what happens if I “intervene” on GearB? Since the gears are connected by a chain, GearA would stop as well. But GearA is the cause and GearB is the effect: intervening on effects is NOT supposed to change the status of the cause. This violates a host of underlying assumptions of causal Bayes nets. (And you can’t represent the gears as causing each other’s movement, since that would be a cyclic graph.)
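A minimal sketch of what the DAG semantics say here (plain Python, with node names and values I’ve made up for illustration): each node is a deterministic function of its parent, and do() simply clamps a node before its descendants are evaluated. Note that do(GearB=stop) leaves GearA turning, which is exactly the mismatch with the physical chain:

    # Toy three-node net Engine -> GearA -> GearB (illustrative names, not
    # any particular library's API). Each child is a deterministic function
    # of its parent; do() clamps a node before evaluating its descendants.
    def simulate(do=None):
        do = do or {}
        v = {}
        v["Engine"] = do.get("Engine", "on")
        v["GearA"] = do.get("GearA", "turn" if v["Engine"] == "on" else "stop")
        v["GearB"] = do.get("GearB", "turn" if v["GearA"] == "turn" else "stop")
        return v

    print(simulate())                      # everything on/turning
    print(simulate(do={"GearA": "stop"}))  # GearB stops, Engine keeps running (the 'undoing' effect)
    print(simulate(do={"GearB": "stop"}))  # GearA keeps turning -- unlike the physical gears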
However, this can be solved if, instead of representing the system as the above net, we represent the physics of the system, i.e. the forces involved, via something that looks vaguely like Newtonian equations. Indeed, this would accord better with people’s hypothesis-testing behavior: if they aren’t sure which gear has the engine behind it, they wouldn’t try “intervening” on GearA’s or GearB’s motion; they’d try removing the chain and seeing which gear is still moving.
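To make the contrast concrete, here is an equally toy functional-model sketch in which the engine’s attachment point and the chain are explicit variables (the variables and equations are my own invention, not Pearl’s notation). The “remove the chain and see which gear still moves” experiment is then just an intervention on the chain variable:

    # Toy functional model of the mechanism (made-up variables, purely
    # illustrative): which gear the engine drives, and whether the chain is
    # attached, are explicit; each gear's motion is a function of those.
    def gears(engine_drives="A", chain_attached=True):
        other = "B" if engine_drives == "A" else "A"
        return {engine_drives: True,       # driven directly by the engine
                other: chain_attached}     # turns only via the chain

    print(gears(chain_attached=True))    # {'A': True, 'B': True}: can't tell which gear has the engine
    print(gears(chain_attached=False))   # {'A': True, 'B': False}: removing the chain reveals it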
At first it seemed to me that causal Bayes nets do only the first kind of representation, not the second. However, I was wrong: Pearl’s “functional causal models” appear to do the latter. These have been far less prevalent in the psych literature, yet they seem extremely important.
Anyways, the moral of the story is that I should really read a lot of Pearl’s Causality, and actually have a grasp of some of the math; I can’t just read the first chapter like most psychology researchers interested in this stuff.
I’m not much of an autodidact when it comes to math, though I’m good at it when put in a class. Can anyone who’s familiar with Pearl’s book give me an idea of what prerequisites would be good to have in order to understand important chunks of it? Or am I overthinking this, and should I just try to plow through?
Any suggestions on classes (or textbooks, I guess), or any thoughts on the above gears example, will be helpful and welcome.
Thanks!
EDIT: Maybe a more specific request could be phrased as follows: will I be better served by taking some extra computer science classes, or some extra math classes (e.g., on calculus and probabilistic systems)?
The system you describe can’t be given by a directed acyclic graph, so why is it surprising that the theory breaks?
The theory is supposed to describe ANY causal system—otherwise it would be a crappy theory of how (normatively) people ought to reason causally, and how (descriptively) people do reason causally.
No, it’s not.
In particular, most of Pearl’s work applies only under some sort of assumption that the underlying process is Markovian. The common criticism of Pearl is that this assumption fails if one assumes quantum mechanics is true. He addresses this in Causality, around chapter two or three. He also addresses extensions to possibly-cyclic diagrams, but the technicalities become annoying.
If you are okay with discretizing time, then Timeless Causality shows a “ladder”-like directed acyclic graph that will approximate the causal system.
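Roughly, the ladder idea is to unroll the coupled gears over discrete time steps, so that the state at t+1 depends only on the state at t; the graph over time-indexed variables is then acyclic even though the gears influence one another. A toy sketch, with an update rule I’ve made up purely for illustration: clamping GearB from some step onward stops GearA one step later, so the “backwards” influence between the gears becomes a forwards-in-time influence in the unrolled graph:

    # Unrolled ("ladder") sketch. The update rule below is a stand-in for the
    # physics, not taken from Pearl or from Timeless Causality.
    def run(steps=5, engine_on=True, do=None):
        do = do or {}                   # e.g. {"GearB": (2, False)}: hold GearB stopped from t=2 on
        a, b = engine_on, engine_on     # GearA, GearB turning at t=0
        history = [(a, b)]
        for t in range(1, steps):
            a_next = engine_on and b    # GearA turns only if the engine drives it
                                        # and GearB (via the chain) isn't jamming it
            b_next = a                  # GearB turns only if GearA drove it
            a, b = a_next, b_next
            for gear, (start, value) in do.items():
                if t >= start and gear == "GearA": a = value
                if t >= start and gear == "GearB": b = value
            history.append((a, b))
        return history

    print(run())                           # both gears keep turning at every step
    print(run(do={"GearB": (2, False)}))   # clamp GearB from t=2: GearA stops at t=3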
If his theory breaks in situations as mundane and simple as the gears example above, then why do the common criticisms of the Markov assumption appeal to the vagaries of quantum mechanics? They might as well have just used simple examples involving gears.
I don’t follow.
You made a claim of the form “For all causal systems, this theory ought to describe them.” I demonstrated otherwise by exhibiting an explicit assumption Pearl makes at the outset, because of which the theory applies only to a subset of causal systems. Gears are classical objects, so a simple example involving gears doesn’t elucidate the weaknesses of assuming all processes are Markov.
Then, I alluded to how one can hack around cycles in causal graphs by approximating them with “ladders”.
As far as I can tell you’re assuming some narrative between these points; there isn’t one.
Oy, I’m not following you either; apologies. You said:
...implying that people generally criticize his theory for “breaking” at quantum mechanics. That is, to find a system outside his “subset of causal systems,” critics have to reach all the way to quantum mechanics. He could respond, “well, QM causes a lot of trouble for a lot of theories.” Not bullet-proof, but still. However, you started (in your very first comment) by saying that his theory “breaks” even in the gears example. So why have critics faulted his theory for breaking down in complicated quantum mechanics, when all along there were much simpler and more common causal situations they could have used?
More generally, I just can’t agree with your interpretation of Pearl as only trying to describe a subset of causal systems, if such a subset excludes commonplace cases like the gears example. I think he was trying to describe a theory of how causation and counterfactuals can be formalized and mathematized so as to describe most of nature. Perhaps this theory doesn’t apply to nature described at the quantum mechanical level, but I find it extremely implausible that it doesn’t apply to the vast majority of nature. It was designed to. Can you really watch this video and deny that he thinks his theory applies to classical physics, such as the gears example? Or do you think he’d be stupid enough not to think of the gears example? I’m baffled by your position.
Hopefully the following clarifies my position.
In what follows, “Pearl’s causal theory” refers to all instances of Pearl’s work of which I am aware. “DAG theory” refers only to the fragment which a priori assumes all causal models are directed acyclic graphs.
Claim 1: DAG theory can’t cope with the gears example. False.
For the third time, there exists an approximation of the gears example that is a directed acyclic graph. See the link in my second comment for the relevant picture.
Claim 2: Pearl’s causal theory can’t cope with the gears example. False.
If the approximation in claim 1 doesn’t satisfy you, then there exists a messy, more computationally expensive extension of the DAG theory that can deal with cyclic causal graphs.
Claim 3: Pearl’s causal theory describes all causal systems everywhere. False.
This is the only claim to which quantum mechanics is relevant.
Thanks, that is helpful.
My claim was that, if we represent the gears example by modeling the underlying (classical) physics of the system via Pearl’s functional causal models, there’s nothing cyclic about the system. Thus, Pearl’s causal theory doesn’t need to resort to the messy, expensive stuff for such systems. It only needs to get messy for systems which are a) cyclic, and b) implausible to model via their physics: for example, negative and positive feedback loops (smoking causes cancer, which causes despair, which causes smoking).
As an aside (I didn’t want to disrupt the chain of responses, so I am putting it here):
I really wish people didn’t use “Markovian” for the adjective form. Due to an earlier immersion in science fiction, seeing “Markovian” gives me intense Well of Souls flashbacks, and I actually have to stop and get back on track with the current argument. I also wish Chalker had used something else for their name, but that is too long past.
I don’t know why I have such an intense response to “Markovian”, since the Well of Souls series wasn’t important to me; there is a lot of SF that I like better. I suspect it has something to do with the unusualness of the name.
The math in Causality was a bit difficult, but it wasn’t unfamiliar (if you have taken at least one class on probability), and I don’t think more classes would help much. You just sorta have to dive in and work through it.
The problem of causality in the system of gears reminds me of this comment, which compares a control-system model based on the stepwise algorithm “measure, compute correction, apply correction, repeat” with the continuous reality of a centrifugal governor for a steam engine. I didn’t fully understand the discussion there, but perhaps there is some relation. (I’m tempted to argue that the governor can be seen as the limit of increasing the update rate of the algorithmic solution, and that this limit is merely best analyzed with different tools than the algorithm, not fundamentally different; but as I said, I haven’t studied the discussion, and the limit still has a cycle in it, so this isn’t directly relevant to your problem.)
Thanks, I’ll check that out soon.