Mark Xu comments on Definitions of Causal Abstraction: Reviewing Beckers & Halpern

Mark Xu 21 Jan 2020 4:40 UTC
6 points
I don’t know if you’ve seen this, but https://arxiv.org/abs/1906.11583 is a follow-up that generalizes the Beckers and Halpern paper to a notion of approximate abstraction: it measures how far the diagram is from commuting using some distance function and takes expectations. I think the most useful notion the paper introduces is a probability distribution over the set of allowed interventions. Intuitively, you don’t need your abstraction of temperature to behave nicely w.r.t. freezing half the room and burning the other half such that the average kinetic energy balances out. Thus you can determine the “approximate commutativeness” of the diagram by fixing a high-level intervention and taking an expectation over the low-level interventions that are likely to map to that high-level intervention.
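As a rough illustration of that last idea, here is a minimal Python sketch. It is a toy, not the paper's definition: the squared-energy mechanism, the four-particle room, and the probabilities are all assumptions made up for the example. It fixes the high-level intervention "temperature = 10" and averages the commutation error over low-level interventions that map to it.

```python
from statistics import fmean

def low_level_outcome(energies):
    # Hypothetical low-level mechanism: the outcome is the average of a
    # nonlinear function (here, the square) of each particle's energy.
    return fmean(e ** 2 for e in energies)

def tau(energies):
    # Abstraction map: temperature is the mean particle energy.
    return fmean(energies)

def high_level_outcome(temperature):
    # Hypothetical high-level mechanism: it only sees the temperature.
    return temperature ** 2

def expected_error(interventions):
    # interventions: list of (probability, energies) pairs that all map to
    # the same high-level intervention under tau. The distance function is
    # the absolute difference between the two paths around the diagram.
    return sum(
        p * abs(low_level_outcome(e) - high_level_outcome(tau(e)))
        for p, e in interventions
    )

uniform = [10.0, 10.0, 10.0, 10.0]  # whole room set to temperature 10
split = [0.0, 0.0, 20.0, 20.0]      # half frozen, half burning, same mean

typical = [(0.999, uniform), (0.001, split)]   # pathological case is rare
adversarial = [(0.5, uniform), (0.5, split)]   # pathological case is common

print(expected_error(typical), expected_error(adversarial))
```

Under these assumptions the diagram commutes exactly for the uniform intervention and fails badly for the split one, so the expected error is small only when the distribution puts little weight on the split case.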
Also, if you are willing to write up your counterexample to the conjecture that Beckers and Halpern make, I am currently researching under Eberhardt and he (and I) would be extremely interested in seeing it. I also initially thought that the conjecture was obviously false, but when I tried to actually construct counterexamples, all of them ended up as either not strong abstractions or not recursive (acyclic) causal models.
- johnswentworth 22 Jan 2020 4:00 UTC
  3 points
  Turns out the particles → fluid example doesn’t work; it’s not a $τ$-abstraction (which makes me think the range of applicability of $τ$-abstraction is considerably narrower than I first thought).
  That said, here’s a counterexample which I think works. Variables of the low-level model:
  - $X_{1}, \dots, X_{n}$ follow an arbitrary structural model
  - $σ$ is a random permutation
  - $Y_{1}, \dots, Y_{n}$ given by $Y_{i} = g(X_{σ(i)}, U_{i}^{Y})$
  … where the $U$ are i.i.d. noise terms. So we have some arbitrary structural model, we scramble the variables, and then we compute a function of each. For the high-level model:
  - $X_{1}^{'}, \dots, X_{n}^{'}$ follow the same model as $X_{1}, \dots, X_{n}$ in the low-level model
  - $Z_{1}, \dots, Z_{n}$ given by $Z_{i} = g(X_{i}^{'}, U_{i}^{Z})$
  … so it’s the same as the low-level model, but with the $Y$ variables unscrambled. The mapping between the two is what you’d expect: $τ$ maps $X \to X^{'}$ directly, and uses $σ$ to unscramble $Y$: $Z = Y_{σ^{-1}}$. Then the interventions $ω_{τ}$ are similarly simple:
  - $ω_{τ} (X_{i} \leftarrow x^{*}) = (X_{i}^{'} \leftarrow x^{*})$
  - $ω_{τ} (Y_{(σ^{*})^{-1}(i)} \leftarrow y^{*}, σ \leftarrow σ^{*}) = (Z_{i} \leftarrow y^{*})$
  Note that we can pick any $σ^{*}$ we please for the last intervention, but we do need to pick one—we can’t just leave it alone.
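The setup above can be sanity-checked numerically. The following Python sketch is only an illustration under stated assumptions: the chain model for $X$, the choice `g(x, u) = 2 * x + u`, and all function names are hypothetical stand-ins for the "arbitrary" pieces. It applies the second low-level intervention, maps the result through $τ$, and compares it with the high-level model under $Z_{i} \leftarrow y^{*}$.

```python
import random

def g(x, u):
    # Hypothetical mechanism; the comment leaves g arbitrary.
    return 2 * x + u

def sample_x(rng, n):
    # Hypothetical structural model for X: a chain X_i = X_{i-1} + noise.
    xs, prev = [], 0.0
    for _ in range(n):
        prev = prev + rng.gauss(0, 1)
        xs.append(prev)
    return xs

def low_level(xs, sigma, u_y, do_y=None):
    # Y_i = g(X_{sigma(i)}, U_i^Y), unless Y_i is set by the intervention do_y.
    do_y = do_y or {}
    return [do_y.get(i, g(xs[sigma[i]], u_y[i])) for i in range(len(xs))]

def tau(xs, sigma, ys):
    # X' = X and Z_j = Y_{sigma^{-1}(j)}, i.e. Z_{sigma(i)} = Y_i.
    zs = [None] * len(ys)
    for i, y in enumerate(ys):
        zs[sigma[i]] = y
    return xs, zs

def high_level(xs, u_z, do_z=None):
    # Z_j = g(X'_j, U_j^Z), unless Z_j is set by the intervention do_z.
    do_z = do_z or {}
    return [do_z.get(j, g(xs[j], u_z[j])) for j in range(len(xs))]

def check(seed=0, n=4, target=2, y_star=7.0):
    rng = random.Random(seed)
    xs = sample_x(rng, n)
    u_y = [rng.gauss(0, 1) for _ in range(n)]
    sigma_star = list(range(n))
    rng.shuffle(sigma_star)              # any fixed sigma* will do
    i = sigma_star.index(target)         # i = (sigma*)^{-1}(target)
    # Low-level intervention: sigma <- sigma*, Y_i <- y*.
    ys = low_level(xs, sigma_star, u_y, do_y={i: y_star})
    _, zs_from_low = tau(xs, sigma_star, ys)
    # High-level intervention Z_target <- y*, with the noise relabelled so
    # that U_j^Z = U_{(sigma*)^{-1}(j)}^Y (harmless, since the U are i.i.d.).
    u_z = [u_y[sigma_star.index(j)] for j in range(n)]
    zs_high = high_level(xs, u_z, do_z={target: y_star})
    return zs_from_low, zs_high

zs_from_low, zs_high = check()
print(zs_from_low == zs_high)
```

This only confirms that the diagram commutes for one sampled setting and one intervention; it is not a proof that the construction is a strong $τ$-abstraction.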
  I’m pretty sure this checks all the boxes for strong $τ$-abstraction. But it isn’t a constructive $τ$-abstraction, since all of the $Z$’s depend on the same low-level variable $σ$. In principle, there could still be some other $τ$ which makes the high-level model a constructive abstraction (B&H’s definition only requires that some $τ$ exist between the two models), but I doubt it.
  Let me know if you guys spot a hole in this setup, or see an elegant way to confirm that there isn’t some other $τ$ that magically makes it constructive.