By the pigeonhole principle (you can’t fit 3 pigeons into 2 pigeonholes) there must be some joint probability distributions which cannot be represented in the first causal structure
Although this is a valid interpretation of the Pigeonhole Principle (PP) for some particular one-to-one cases, I think it misses the main point as it relates to this particular example. You absolutely can fit 3 pigeons into 2 pigeonholes, and the standard (to my knowledge) intended takeaway from the PP is that you are going to have to, if you want your pigeons holed. There might just not be a cute way to do it.
The idea being that for finite sets A and B with card(A) > card(B), there is no injective f from A to B (you could see this as losing information); but you absolutely can send things from A to B, you just have to be aware that at least two elements that were different in A are going to be mapped onto the same element of B. This is an issue of losing uniqueness of representation, not impossibility of the representation itself. It is even useful sometimes.
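To make the point concrete, here is a toy sketch (my own, not from the post): with 3 "pigeons" and 2 "holes" there are plenty of total functions, just no injective ones. The names are of course made up for illustration.

```python
# Map 3 pigeons into 2 holes: lots of ways to do it, none injective.
from itertools import product

pigeons = ["p1", "p2", "p3"]
holes = ["h1", "h2"]

# Every possible assignment of a hole to each pigeon.
assignments = list(product(holes, repeat=len(pigeons)))
print(len(assignments))  # 2**3 = 8 total functions exist

# ...but none is injective: some hole always receives >= 2 pigeons.
injective = [a for a in assignments if len(set(a)) == len(pigeons)]
print(injective)  # [] -- the pigeonhole principle in action
```

So representability is never in question here; only uniqueness is.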
A priori, it looks possible for a function to exist by which two different joint distributions would be mapped onto the same causal structure in some nice, natural or meaningful way, in the sense that only related-in-some-cute-way joint distributions would share the same representation. If there are no such natural functions, there are definitely ugly ones. You can always cram all your pigeons into the first pigeonhole. They are all represented!
On the other hand, the full joint probability distribution would have 3 degrees of freedom—a free choice of p(earthquake&recession), a choice of p(earthquake&¬recession), a choice of p(¬earthquake&recession), and then a constrained p(¬earthquake&¬recession) which must be equal to 1 minus the sum of the other three, so that all four probabilities sum to 1.0.
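The quoted counting argument can be written out directly; the three probabilities below are arbitrary illustrative values of mine (chosen as exact binary fractions so the arithmetic is exact), with the fourth forced by normalisation.

```python
# Joint distribution over two binary variables: 3 free parameters,
# 4th pinned down by the requirement that everything sums to 1.
p_e_r   = 0.25   # p(earthquake & recession)    -- free choice
p_e_nr  = 0.125  # p(earthquake & ¬recession)   -- free choice
p_ne_r  = 0.5    # p(¬earthquake & recession)   -- free choice
p_ne_nr = 1.0 - (p_e_r + p_e_nr + p_ne_r)  # constrained remainder

print(p_ne_nr)                                    # 0.125
print(p_e_r + p_e_nr + p_ne_r + p_ne_nr == 1.0)  # True
```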
If you get to infinite stuff, it gets worse. You actually can inject R² into R (or, in this case, R³, with three degrees of freedom, into R², with two), meaning that not only can you represent every 3D vector in 2D (which we do all the time), but there are particular representations that won't be lossy, with every 3D object being uniquely represented. You won't lose that information! (The operative word being that: you are most definitely losing something else.)
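A rough sketch (assumptions mine) of the classic construction behind such injections: interleave the digits of the two coordinates. With finite-precision decimal strings, as below, the map is injective and even invertible; the full real-number version needs extra care with trailing 9s.

```python
# Digit interleaving: encode two numbers in [0,1) as one, losslessly.

def interleave(x: str, y: str) -> str:
    """Interleave the decimal digits of x and y, e.g. '0.12', '0.34' -> '0.1324'."""
    dx, dy = x.split(".")[1], y.split(".")[1]
    n = max(len(dx), len(dy))
    dx, dy = dx.ljust(n, "0"), dy.ljust(n, "0")  # pad to equal length
    return "0." + "".join(a + b for a, b in zip(dx, dy))

def deinterleave(z: str) -> tuple[str, str]:
    """Recover (x, y) from the interleaved encoding -- nothing is lost."""
    dz = z.split(".")[1]
    return "0." + dz[0::2], "0." + dz[1::2]

z = interleave("0.12", "0.34")
print(z)                # 0.1324
print(deinterleave(z))  # ('0.12', '0.34')
```

The encoding is unique, but (as the reply below notes) it is wildly discontinuous: nearby pairs can land far apart.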
If you substitute [0,1]³ and [0,1]² for R³ and R², to account for the fact that you are working with probabilities, this is still true.
So, assuming there are infinitely many things to account for in the Universe, you would be able to represent the joint distributions as causal diagrams uniquely. If there aren't, you may be able to group them following "nice" relationships. If you can't do that, you can always cram them in willy-nilly. There is no need for any joint distribution to go unrepresented. I'm not sure how this affects the following:
This means the first causal structure is falsifiable; there’s survey data we can get which would lead us to reject it as a hypothesis
Seemed weird to me that this hasn't been pointed out (after a rather superficial look at the comments), so I'm pretty sure that either I'm missing something and there are already 2-3 similarly wrong, refuted comments on this, or this has actually been talked about already and I just haven't seen it.
Edit: Just realized, in
If you substitute [0,1]³ and [0,1]² for R³ and R², to account for the fact that you are working with probabilities, this is still true.
the [0,1]³ and [0,1]² should instead be the probability simplices from R³ and R² (call them B³ and B²), where the non-negative components sum up to 1. Pretty sure the injection still holds.
None of these bijections have nice properties, though. There are bijections between R³ and R², but no continuous ones. (I’m not sure if they can even be measurable.) One might criticise Eliezer’s mention of the Pigeonhole principle, but the point that he is making stands: a three-dimensional space cannot be mapped out by two real-valued parameters in any useful way. A minimal notion of “useful” here might be a local homeomorphism between manifolds, and this is clearly impossible when the dimensions are different.
There is a big leap between "there are no X, so Y" and "there are no useful X (useful meaning local homeomorphisms), so Y", though. Also, a local homeomorphism seems too strong a standard to set. But sure, I kind of agree on this. So let's forget about injections. Orthogonal projections seem to be very useful under many standards, albeit lossy. I'm not confident that there are no analogous, useful equivalence classes in A (joint probability distributions) that can be nicely mapped to B (causal diagrams). Either way, the conclusion
This means the first causal structure is falsifiable; there’s survey data we can get which would lead us to reject it as a hypothesis
can’t be entailed from the above alone.
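To illustrate the "lossy but useful" kind of map mentioned above, here is a minimal sketch of mine: orthogonal projection of R³ onto the xy-plane (just dropping the z coordinate). It is not injective, yet it is continuous, linear, and used everywhere.

```python
# Orthogonal projection R^3 -> R^2: distinct points can collide
# (loss of injectivity), but the map itself is perfectly well behaved.

def project_xy(p: tuple[float, float, float]) -> tuple[float, float]:
    """Orthogonally project a 3D point onto the xy-plane."""
    x, y, z = p
    return (x, y)

a = (1.0, 2.0, 3.0)
b = (1.0, 2.0, -7.0)
print(project_xy(a) == project_xy(b))  # True: two points, one shadow
print(a == b)                          # False: the originals differed
```

The equivalence classes here (vertical lines) are exactly the "nice" groupings of collided elements; the question is whether joint distributions admit anything similar with respect to causal diagrams.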
Note: my model of this is just balls in Rⁿ, so the elements might not have the same accidental properties as the ones in A and B (if so, please explain :) ), but my underlying issue is with the actual structure of the argument.