I was wondering this myself. I roughly knew of Solomonoff Induction as related… but apparently that is equivalent! The next thing my memory turned up was “Minimum Description Length” principle, which as it turns out… is also a version of Occam’s Razor. Funny how that works.
If we look at the original question again… “If two hypotheses fit the same observations equally well, why believe the simpler one is more likely to be true?” If I understand the conjunction fallacy correctly, it is strictly true that adding more propositions cannot increase the probability. That is to say, P(A & B) ≤ P(B)… and P(A & B) ≤ P(A).
So the argument could be made that B might have probability one, in which case adding it leaves the hypothesis exactly as probable as before. But if you start with A, and B has probability less than one, including it will strictly lower the probability. Thus as far as I can tell, Occam’s Razor holds except where the additional propositions have probability one.
...But if they have probability one, wouldn’t they have to be axiomatically identical to just having proposition A? Or would it perhaps have to be probability one given A? I honestly don’t know enough here, but I think the basic idea stands?
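For what it is worth, the product rule seems to settle the "probability one given A" question. A short worked version, using only the standard definition of conditional probability (the symbols A and B are the same propositions as above):

```latex
\begin{align*}
P(A \land B) &= P(A)\,P(B \mid A) \\
             &\le P(A) \qquad \text{since } 0 \le P(B \mid A) \le 1 .
\end{align*}
% Assuming P(A) > 0, equality holds exactly when P(B \mid A) = 1.
```

So, assuming P(A) > 0, adding B costs nothing exactly when B has probability one given A. The unconditional P(B) can be less than one without lowering P(A & B) below P(A), as long as B is certain whenever A holds.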
As Richard Kennaway has said, this only deals with cases where one hypothesis is a conjunction including another (e.g., “There is a god” and “There is a god called Bill”), but most cases in which we actually want to apply OR aren’t like that; they’re more like “geocentric astronomy with circular orbits plus epicycles” and “heliocentric astronomy with elliptical orbits”.
Ah. Yeah that does clear things up a bit. What would a solution look like, then? To show the complexity of an idea impacts its probability… but unless you use the historic argument of ‘it’s looked that way in the past for stuff like this’ I don’t see any way of even approaching that.
What if we imagine the space of hypotheses? A simpler hypothesis would be a larger circle because there may be more specific rules that act in accordance with it. ‘The strength of a hypothesis is not what it can explain, but what it fails to account for’, so a complicated prediction should occupy a very tiny region and therefore have a tiny probability.
Or… is that just another version of Solomonoff Induction, and so the same thing?
Near as I can tell, you’re describing the same conjunction rule from your previous comment!
This conjunction rule says that a claim like ‘The laws of physics always hold,’ has less probability than, ‘The laws of physics hold up until September 25, 2015 (whether or not they continue to hold after).’
Solomonoff Induction is an attempt to find a rule that says, ‘OK, but the first claim accounts for nearly all of the probability assigned to the second claim.’
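Here is a toy sketch of that last point. It is not Solomonoff Induction itself (which is uncomputable); it only assumes a simplicity prior where each extra bit of description halves a hypothesis's weight. Every number and name in it (`BASE_BITS`, `EXCEPTION_BITS`, `NUM_EXCEPTIONS`) is made up for illustration.

```python
from fractions import Fraction


def prior_weight(description_bits: int) -> Fraction:
    """Toy simplicity prior: weight 2^-bits, so each extra bit halves it."""
    return Fraction(1, 2 ** description_bits)


# Hypothetical description lengths in bits (illustrative assumptions only).
BASE_BITS = 20         # "the laws of physics always hold"
EXCEPTION_BITS = 15    # extra bits to add "...until date T, then do X instead"
NUM_EXCEPTIONS = 1000  # distinct "hold until the cutoff, then break" variants

always = prior_weight(BASE_BITS)
each_exception = prior_weight(BASE_BITS + EXCEPTION_BITS)

# "The laws hold up until the cutoff (whether or not they hold after)" is the
# disjunction of "always hold" and every "hold until cutoff, then break" variant.
until_cutoff = always + NUM_EXCEPTIONS * each_exception

# Fraction of the weaker claim's weight that comes from the simple hypothesis.
share = always / until_cutoff

print(f"weight of 'always hold':       {float(always):.3e}")
print(f"weight of 'hold until cutoff': {float(until_cutoff):.3e}")
print(f"share from 'always hold':      {float(share):.4f}")
```

The conjunction rule still holds here (`always` is never larger than `until_cutoff`), but with these assumed numbers the simple claim supplies roughly 97% of the weaker claim's weight, because each exception has to pay extra bits to specify when and how the laws break.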
Hrm, yeah. I think I need more tools and experience to be able to think about this properly.