Here are two arguments (or maybe two formulations of one argument) for complexity reducing probability. I think the juxtaposition explains why it doesn’t feel like complexity should be a straight-up penalty for a theory.
The human-level argument for complexity reducing probability is something like this: A∩B is more probable than A∩B∩C because the second has three fault-lines, so to speak, and the first only has two, so the second is more likely to crack. (Edit: equally or more likely, not strictly more likely.) (For the engineers out there: I have found this metaphor invaluable both for spotting this in conversation and for explaining it to people.) As byrnema noted down below, that doesn’t seem applicable here, at least not in the direct simpler = better way, especially when having the same predictions seems to indicate that A, B, and C are all right.
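The inequality behind the fault-line picture follows from the product rule. A short derivation, assuming only that Pr(A∩B) > 0 so that the conditional probability is defined (if Pr(A∩B) = 0, both sides are zero and the inequality holds trivially):

```latex
\begin{align*}
\Pr(A \cap B \cap C) &= \Pr(A \cap B)\,\Pr(C \mid A \cap B) \\
                     &\le \Pr(A \cap B),
                     \quad \text{since } \Pr(C \mid A \cap B) \le 1 .
\end{align*}
```

Equality holds exactly when Pr(C | A∩B) = 1, so the third fault-line costs nothing in the case where C is certain given A and B.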
The formal argument for a complexity penalty (and this is philosophy, so bear with me) is that a priori, with absolutely no experiences of the universe, all premises are equally likely (with nothing to privilege any of them, they default… the universal prior, if you like), so the theory with the fewest conjunctions of premises is the most likely by virtue of probability theory.
Now, we are restricted in our observations, because they don’t tell us what actually is; they merely tell us that anything that predicts the outcome remains possible, and everything that doesn’t predict the outcome is ruled out. What remains includes ad hoc theories and overcomplicated theories like “Odin made Horus made God made the universe as we know it.” However, we can extend the previous argument: given that our observations have narrowed the universe as we know it to this section of hypotheses, we have no experiences that say anything about which hypothesis within that section is right. So, a priori, all possible premises within that section are equally likely, and we should choose the hypothesis with the fewest conjunctions of premises, according to probability theory.
This doesn’t really get to the heart of the matter addressed in the post, but it does justify a form of complexity-as-penalty that has some bearing: namely, if Hamiltonian requires fewer premises than Lagrangian, and predictions bear out both systems equally well, then Hamiltonian is more probable, because it is less likely to be wrong due to a false premise somewhere in the area we haven’t yet accessed. (In formal logic, Lagrangian is probably using some premise it doesn’t need.)
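To put rough numbers on the "fewest conjunctions" argument, here is a minimal sketch. It assumes that the untested premises are independent and that each has the same probability `p`. Independence is an extra assumption, not something the argument above establishes, and the value 9/10 is a made-up illustration.

```python
from fractions import Fraction


def conjunction_probability(premise_probs):
    """Probability that every premise holds, ASSUMING the premises are independent."""
    result = Fraction(1)
    for p in premise_probs:
        result *= p
    return result


# Hypothetical value: every untested premise gets the same probability.
p = Fraction(9, 10)

# Stand-ins for a leaner theory and one that needs a single extra premise.
two_premises = conjunction_probability([p] * 2)
three_premises = conjunction_probability([p] * 3)

print(two_premises)    # 81/100
print(three_premises)  # 729/1000
```

Under these assumptions each extra premise multiplies the probability by `p`, so the penalty disappears only when the extra premise is certain (`p = 1`).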
Strictly speaking, Pr(A∩B) ≥ Pr(A∩B∩C), not Pr(A∩B) > Pr(A∩B∩C). Otherwise, excellent post.
Oh dear. Thanks for pointing that out! Going to fix it.
Uuuhhhh, wait, there’s something wrong with your post. A simple logical statement can imply a complex-looking logical statement, right? Imagine that C is a very simple statement that implies B which is very complex. Then A∩B∩C is logically equivalent to A∩C, which is simpler than A∩B because C is simpler than B by assumption. Whoops.
You can make a statement more complex by adding more conjunctions or by adding more disjunctions. In general, the complexity of a statement about the world has no direct bearing on the prior probability we ought to assign to it. My previous post (linked from this one) talks about that.
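The objection above (a simple C that implies B) and the point about disjunctions can both be checked by brute force over truth assignments. This is only a sketch: the uniform weights over the allowed worlds are a hypothetical choice, and any distribution that puts zero weight on "C true, B false" would do.

```python
from fractions import Fraction
from itertools import product

# All 8 truth assignments for (A, B, C).
worlds = list(product([False, True], repeat=3))

# Hypothetical distribution in which C implies B:
# worlds where C is true and B is false get zero weight, the rest share weight equally.
raw = [0 if (c and not b) else 1 for (a, b, c) in worlds]
weights = [Fraction(r, sum(raw)) for r in raw]


def prob(event):
    """Total weight of the worlds in which `event(a, b, c)` is true."""
    return sum(w for world, w in zip(worlds, weights) if event(*world))


p_abc = prob(lambda a, b, c: a and b and c)
p_ac = prob(lambda a, b, c: a and c)
p_ab = prob(lambda a, b, c: a and b)
p_a_or_b = prob(lambda a, b, c: a or b)

print(p_abc, p_ac)     # 1/6 1/6  (A∩B∩C and A∩C pick out the same worlds)
print(p_ab, p_a_or_b)  # 1/3 5/6  (adding a disjunct raises probability)
```

The longer-looking statement A∩B∩C gets the same probability as A∩C here, and the disjunction is more probable than the conjunction. In this sketch, probability tracks which worlds a statement rules out, not how long it is.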