I’d like to point out a source of confusion around Occam’s Razor that I see you’re falling for; dispelling it will make things clearer. The principle says “entities should not be multiplied beyond necessity”, which means Occam’s Razor helps decide between competing theories if and only if they have the same explanatory and predictive power. But in the history of science, it was almost never the case that competing theories had the same power. Maybe it happened a couple of times (epicycles, the Copenhagen interpretation), but in all other instances a theory was selected not because it was simpler, but because it was much more powerful.
Contrary to popular misconception, situations where Occam’s razor actually gets to be used are very, very rare.
We do have, however, a formalization of that principle in algorithmic information theory: Solomonoff induction. An agent that, to predict the continuation of a sequence, places the highest probabilities on the shortest programs compatible with the observations will eventually outperform every other computable predictor. The catch is the word ‘eventually’: every complexity measure carries an additive constant that depends on the choice of the reference universal Turing machine. Different reference machines assign different complexities to the same short programs, but any two measures agree up to that constant, so their predictions converge after a finite amount of data.
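To make the “highest probabilities on the shortest compatible programs” idea concrete, here is a minimal sketch. It is not the real construction (which mixes over all programs for a universal machine); it uses a tiny hand-picked hypothesis class, and all the names (`HYPOTHESES`, `predict_next`) and the description lengths are illustrative assumptions. Each hypothesis gets prior weight 2^(-length), hypotheses contradicted by the data are discarded, and the prediction is the weighted vote of the survivors.

```python
# Toy Solomonoff-style predictor: a 2^(-length) prior over a small,
# hand-picked hypothesis class instead of all programs on a universal
# Turing machine. Lengths and hypotheses are illustrative only.

HYPOTHESES = {
    # name: (description length in bits, i-th bit of the predicted sequence)
    "all zeros":      (2, lambda i: 0),
    "all ones":       (2, lambda i: 1),
    "alternating 01": (4, lambda i: i % 2),
    "period 0011":    (6, lambda i: (i // 2) % 2),
}

def predict_next(observed: list[int]) -> float:
    """Probability that the next bit is 1, given the observed prefix."""
    weights = {}
    for name, (length, gen) in HYPOTHESES.items():
        # Keep only hypotheses compatible with the data seen so far.
        if all(gen(i) == b for i, b in enumerate(observed)):
            weights[name] = 2.0 ** (-length)   # shorter = higher prior
    total = sum(weights.values())
    if total == 0:
        return 0.5  # nothing in the class fits; fall back to ignorance
    next_i = len(observed)
    return sum(w for name, w in weights.items()
               if HYPOTHESES[name][1](next_i) == 1) / total

print(predict_next([0, 0]))        # ~0.06: the shorter "all zeros" dominates
print(predict_next([0, 1, 0, 1]))  # 0.0: only "alternating 01" survives
```

The point of the toy version is only to show where simplicity enters: it is a prior weighting, and with enough data the weighting is washed out by which hypotheses survive.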
This is also why I think the “Thor vs. clouds” explanation of thunder is such a poor example of Occam’s razor: Solomonoff induction is a formalization of Occam’s razor for theories, not explanations. Because of the aforementioned constant, you cannot have an absolutely simpler model of a finite sequence of events. There’s no such thing; it will always depend on the complexity of the reference Turing machine. However, you can have eventually simpler models of an infinite sequence of events (predictors of infinite sequences are equivalent to programs). In that case, the natural-event program will prevail, because it lets you predict, and therefore control, the outcomes better.
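To state the constant I keep invoking precisely (this is just the standard invariance theorem, nothing specific to the Thor example): for any two universal Turing machines $U$ and $V$ there is a constant $c_{U,V}$, independent of the string $x$, such that

$$\lvert K_U(x) - K_V(x) \rvert \le c_{U,V}.$$

For a single finite sequence that constant can dwarf the complexity of the data itself (nothing stops you from picking a reference machine with “Thor” as a built-in primitive), which is why “absolutely simpler” is undefined there; over an infinite stream of observations the constant is paid only once, so the asymptotically shorter program wins.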