Occam’s Razor is a heuristic, and one I proceed according to, but it’s not at all clear what its justification is. Why exactly ought we to believe the simpler hypothesis?
The best justification I’ve heard for believing simple hypotheses is an argument from probability.
Consider some event caused by a certain block. We know the block’s color must be either red, yellow, blue, or green; its shape must be either square, round, or triangular; its material must be either wood or metal.
We come up with two theories about the event. Both theories explain the event adequately:
Theory 1: The event was caused by the block being made of wood.
Theory 2: The event was caused by the block being blue, triangular, and made of metal.
Before the event happens, there are twenty-four different possible configurations of the block (four colors times three shapes times two materials). “Made of wood” is true of twelve configurations; “blue, triangular, and made of metal” is true of one configuration.
After the event, we dismiss all configurations except the thirteen under which we believe the event was possible: the twelve wooden ones plus the single blue, triangular, metal one. We assume all thirteen are equally likely. Therefore, there’s a 12⁄13 chance that the block is made of wood and a 1⁄13 chance the block is blue, triangular, and made of metal.
Therefore, Theory 1 is twelve times more likely than Theory 2.
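The block example can be checked by brute-force enumeration. Here is a quick Python sketch (the attribute names and tuple layout are just illustrative choices, not anything from the original discussion):

```python
from fractions import Fraction
from itertools import product

colors = ["red", "yellow", "blue", "green"]
shapes = ["square", "round", "triangular"]
materials = ["wood", "metal"]

# All possible block configurations before the event: 4 * 3 * 2 = 24.
configs = list(product(colors, shapes, materials))

# Configurations compatible with each theory.
theory1 = [c for c in configs if c[2] == "wood"]
theory2 = [c for c in configs if c == ("blue", "triangular", "metal")]

# The two sets are disjoint (wood vs. metal), so 12 + 1 = 13
# configurations remain possible after the event.
possible = theory1 + theory2

p_theory1 = Fraction(len(theory1), len(possible))  # 12/13
p_theory2 = Fraction(len(theory2), len(possible))  # 1/13
print(p_theory1, p_theory2)
```

Running this prints 12/13 and 1/13, matching the counting argument above.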
The same principle is at work any time you have a simple theory competing with a more complex theory. Because the complicated theory has more preconditions that have to be just right, it has a lower prior probability relative to the simple theory, and since the occurrence of the event supports both theories equally (each explains it adequately), it also has a lower posterior probability.
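In Bayesian terms, this is a minimal sketch of that argument, using the counts from the block example as priors and assuming (as the example stipulates) that both theories predict the event with the same likelihood:

```python
from fractions import Fraction

# Priors from counting configurations under a uniform distribution.
prior_simple = Fraction(12, 24)   # "made of wood"
prior_complex = Fraction(1, 24)   # "blue, triangular, and made of metal"

# Both theories explain the event adequately, so the likelihood
# P(event | theory) is the same for each; its exact value cancels.
likelihood = Fraction(1)

# Bayes: posterior is proportional to prior * likelihood.
# Equal likelihoods leave the prior ratio untouched.
ratio = (prior_simple * likelihood) / (prior_complex * likelihood)
print(ratio)
```

The ratio comes out to 12: updating on the event does not change the relative standing the priors already established.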
I know I read this explanation first on a discussion of Kolmogorov complexity on someone’s rationality blog, but I can’t remember whose or what the link was. If I stole your explanation, please step up and take credit.
It also helps to keep in mind that the state space, with its associated probability distribution, is something you dress the actual state of reality in. The model keeps track of the structure of the data you have about the actual state of reality, which hides in one tiny point of the state space. Probabilities of regions of state space (events/hypotheses) quantitatively express the relation between those aspects of the model and the reality it’s about.
You move from simpler hypotheses to more complex hypotheses for the same reason that you count from small numbers to big numbers.
Try imagining what counting the natural numbers “in the opposite order” would look like.
Of course, you can have large wiggles. For example, you might alternate jumping up to the next power of two and counting backwards. But using different representations for hypotheses leads to the same sort of wiggles in Occam’s Razor.
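For concreteness, here is one way to read that “wiggle” scheme in Python. The exact jumping rule is my own assumption; the point is just that the sequence runs backwards locally while still visiting every natural number and growing overall:

```python
def wiggly_count(limit_power):
    """Enumerate the naturals by jumping up to each successive power
    of two and then counting backwards to just past the previous one."""
    seq = [1]
    prev = 1
    for k in range(1, limit_power + 1):
        power = 2 ** k
        # Jump up to the power of two, then count back down toward
        # (but not including) the previous power.
        seq.extend(range(power, prev, -1))
        prev = power
    return seq

print(wiggly_count(3))  # e.g. 1, 2, 4, 3, 8, 7, 6, 5
```

Every natural number up to the last power of two appears exactly once, so this is a legitimate enumeration despite the local descents.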
For simplicity, Occam’s razor is often cited as “choose the simplest hypothesis,” even when it’s more appropriate to employ its original formulation: the principle that one should favor the explanation that requires the fewest assumptions.
I agree that less_schlong shouldn’t be citing Occam’s razor as some fundamental law of the universe, but I do think it’s obvious that all things being equal, we should attempt to minimize speculative assumptions.