Adding to Vulture’s reply (that you cannot make absolute positive statements about truth), the modern view of “Occam’s razor” (at least in Bayesian thought) is the minimum description length (MDL) principle (http://en.wikipedia.org/wiki/Minimum_description_length), which can be rigorously formalized. In this formalism, simplicity becomes a prior over models. Multiplying that prior by the likelihood of the data under each model gives you a posterior. Under this posterior, if two models make exactly the same predictions, the simpler one is preferred (note that the more complicated one isn’t completely rejected; it’s just assigned lower posterior probability).
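The prior-times-likelihood calculation above can be sketched in a few lines of Python. This is a toy illustration, not anything from the linked article: the description lengths and likelihoods are made-up numbers, and the helper name `mdl_posterior` is my own. It shows the key point: with an MDL-style prior P(M) ∝ 2^(−description length), two models with identical likelihood end up with posteriors that favor the simpler one, without the complex one being rejected outright.

```python
# Toy sketch of an MDL-style complexity prior over models (illustrative numbers).
# Prior: P(M) ∝ 2^(-description_length_bits); posterior ∝ prior × likelihood.

def mdl_posterior(models):
    """models: dict name -> (description_length_bits, likelihood_of_data).
    Returns normalized posterior probabilities over the model set."""
    unnorm = {name: 2.0 ** -bits * lik for name, (bits, lik) in models.items()}
    z = sum(unnorm.values())
    return {name: p / z for name, p in unnorm.items()}

# Two models that explain the data equally well (identical likelihood),
# differing only in complexity: the simpler one gets more posterior mass,
# but the complex one keeps nonzero probability.
models = {
    "simple":  (10, 0.5),   # 10-bit description, likelihood 0.5
    "complex": (20, 0.5),   # 20-bit description, same likelihood
}
post = mdl_posterior(models)
```

Here the posterior ratio is exactly the prior ratio, 2^10 : 1 in favor of the simpler model, since the likelihoods cancel.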
There are deep theoretical reasons why MDL is a good way of assigning a prior over beliefs. Someone who wants to reject Occam’s razor would have to give an alternative system and show either that it yields better long-term utility under MDL’s assumptions, or that those assumptions are unfounded.
Perhaps you can comment on this paper, which argues that the claim “simpler models are always more likely” is false: http://www2.denizyuret.com/ref/domingos/www.cs.washington.edu/homes/pedrod/papers/dmkd99.pdf
That paper doesn’t seem to be arguing against Occam’s razor. Rather, it makes the more specific point that greater model complexity on training data doesn’t necessarily mean worse generalization error. I didn’t read through the whole article, so I can’t say whether the arguments hold up, but if you follow the procedure of updating your posteriors as new data arrives, the point seems moot. Besides, the complexity-prior framework doesn’t make that claim in the first place.
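The "updating your posteriors as new data arrives" procedure can be sketched as follows. Everything here is an illustrative assumption, not taken from the paper: the model names, the bit counts, and the per-observation predictive likelihoods (0.6 vs. 0.3) are invented to show the mechanism. The idea is that a model which merely fit the training data but predicts new observations poorly loses posterior mass with each update, regardless of how it scored on the training set.

```python
import math

def update(log_post, log_liks):
    """One Bayesian update in log space; returns normalized log posteriors."""
    unnorm = {m: lp + log_liks[m] for m, lp in log_post.items()}
    log_z = math.log(sum(math.exp(v) for v in unnorm.values()))
    return {m: v - log_z for m, v in unnorm.items()}

# Start from a complexity prior P(M) ∝ 2^(-bits); bit counts are made up.
bits = {"simple": 10, "overfit": 50}
log_post = {m: -b * math.log(2) for m, b in bits.items()}
log_z = math.log(sum(math.exp(v) for v in log_post.values()))
log_post = {m: v - log_z for m, v in log_post.items()}

# Assumed predictive likelihoods per new observation: the simple model assigns
# 0.6 to each incoming data point, the overfit model only 0.3 (it memorized
# the training set and generalizes poorly).
for _ in range(10):
    log_post = update(log_post, {"simple": math.log(0.6),
                                 "overfit": math.log(0.3)})

posterior = {m: math.exp(v) for m, v in log_post.items()}
```

After ten observations the overfit model has lost another factor of 2^10 in posterior odds on top of its complexity penalty, which is the sense in which sequential updating makes the training-set objection moot.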