The A=a notation always bugged me too.
I like the above notation because it betrays morphism composition.
If we consider random variables as measure(able) spaces and conditional probabilities P(B | A) as stochastic maps A → P(B) (written A ~> B below), then every element ‘a’ of (a countably generated) A induces a point measure → A giving probability 1 to that event. This is the map named by do(a). But since we’re composing maps, not elements, we can use an element a unambiguously to mean its point measure. Then a series of measures separated by ‘,’ gives the product measure.
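In the above example, let a : A (implicitly, → A), a’ : B (implicitly, → B), M : B ~> C, Y : (A,C) ~> D. Then Y(a,M(a’)) is a stochastic map ~> D given by composition: take the product measure (a, a’) on (A,B), apply M to the B component to get a measure on (A,C), then apply Y.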
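For finite spaces, that composition can be sketched in a few lines of Python. This is only an illustration under assumptions of mine: measures are dicts from outcomes to probabilities, and the kernels M and Y, the outcome labels ("a1", "b1", "c0", ...) and all the numbers are made up, not taken from the example above.

```python
from fractions import Fraction

def point(x):
    """Point measure: probability 1 on the single outcome x."""
    return {x: Fraction(1)}

def product(p, q):
    """Product measure on pairs, from measures p and q."""
    return {(x, y): px * qy for x, px in p.items() for y, qy in q.items()}

def bind(p, kernel):
    """Push the measure p through a stochastic map (composition)."""
    out = {}
    for x, px in p.items():
        for y, py in kernel(x).items():
            out[y] = out.get(y, 0) + px * py
    return out

# Hypothetical kernel M : B ~> C (numbers are illustrative only).
def M(b):
    if b == "b1":
        return {"c0": Fraction(1, 4), "c1": Fraction(3, 4)}
    return {"c0": Fraction(1, 2), "c1": Fraction(1, 2)}

# Hypothetical kernel Y : (A, C) ~> D (numbers are illustrative only).
def Y(ac):
    p = Fraction(1, 2) if ac == ("a1", "c1") else Fraction(1, 10)
    return {"d1": p, "d0": 1 - p}

def potential_outcome(a, a_prime):
    """Y(a, M(a')): the elements stand for their point measures."""
    return bind(product(point(a), bind(point(a_prime), M)), Y)

print(potential_outcome("a1", "b1"))
```

With these made-up numbers, Y("a1", M("b1")) puts mass 2/5 on "d1" and 3/5 on "d0". Note that a and a’ enter only through their point measures, which is why writing the bare element is unambiguous.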
EDIT: How do I ascii art?
All of this is a fancy way of saying that “potential outcome” notation conveys exactly the right information to make probabilities behave nicely.