Yes, but my point is that thinking about SI or MML in the abstract helps because people sometimes gain insight from asking “How complex is that computer program?” I haven’t seen appeal-to-CEV produce much insight in practice, and any insight it could produce can probably be better produced by appealing to the relevant component principle of CEV instead. (Nor yet is this a critique of CEV, because it’s meant as an AI design, not as a moral intuition pump.)
Can you provide an example where Solomonoff Induction can be used to gain insight that Occam’s razor doesn’t help to gain?
William of Ockham originally used his principle to argue for the existence of God (God is the only necessary entity, therefore the simplest explanation).
That’s a truly epic fail, since Occam’s razor is the strongest argument against the existence of God.
It’s worth noting that the current formulation “entities must not be multiplied beyond necessity” is much more recent than Ockham’s original formulation “For nothing ought to be posited without a reason given, unless it is self-evident (literally, known through itself) or known by experience or proved by the authority of Sacred Scripture.”
I suppose that he included the reference to the Sacred Scripture specifically because he realized that without it, God would be the first thing to fly out of the window.
I sometimes wish I knew which philosophers of the time were sincere in their religious disclaimers.
Consider it done.
My thought in leaving that comment rather than doing it myself was for V_V to get credit, but OK.
How else can you impartially wield Occam’s Razor than with a formal model, and what convincing formalization is there other than Kolmogorov Complexity (and assorted variants), which SI in a way extends?
Setting aside the theoretical objections to Solomonoff induction (the a priori assumption that hypotheses are computable, disregard of logical depth, dependence on the details of the computational model, normalization issues), even if you accept it as a proper formalization of Occam’s Razor, in order to apply it in a formal argument you would have to perform an uncomputable calculation.
Since you can’t do that, what’s left of it?
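One of those objections, the dependence on the computational model, is easy to see with a crude computable stand-in: off-the-shelf compressors acting as ad-hoc “reference machines”. This is only a toy illustration (compressed length is a computable upper bound on description length, not Kolmogorov complexity itself), and the example strings are made up for the demo:

```python
import bz2
import random
import zlib

def description_length(data: bytes, compress) -> int:
    # Compressed size: a crude, computable upper bound on description length.
    return len(compress(data))

structured = b"ab" * 500  # highly regular, should compress very well
rng = random.Random(0)    # seeded so the demo is reproducible
noisy = bytes(rng.getrandbits(8) for _ in range(1000))  # pseudo-random, near-incompressible

for name, compress in [("zlib", zlib.compress), ("bz2", bz2.compress)]:
    print(name,
          description_length(structured, compress),
          description_length(noisy, compress))
```

Both “machines” agree that the regular string is far simpler than the noisy one, but the absolute numbers differ; the invariance theorem only guarantees agreement up to an additive constant, which is part of why unqualified informal claims about “the” complexity of a hypothesis are slippery.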
Besides noting that there are computable versions of Kolmogorov Complexity (such as MML), in your parent comment you contrasted the use of SI with using Occam’s Razor itself.
That’s what I was asking about, and it doesn’t seem like you answered it:
How do you use Occam’s Razor? What formalizations do you perceive as “proper”? Or, if you’re just intuiting the heuristic and guesstimating the complexity, what is the formal principle that your intuition derives from or approximates, and how does it differ from e.g. Kolmogorov Complexity?
If by MML you mean Minimum message length, then I don’t think that’s correct. This paper compares Minimum message length with Kolmogorov Complexity but it doesn’t seem to make that claim.
My point is that Kolmogorov complexity, Solomonoff induction, etc., are mathematical constructions with a formal semantics. Talking about “informal” Kolmogorov complexity is pseudo-mathematics, which is usually an attempt to make your arguments sound more compelling than they are by dressing them in mathematical language.
If there is a disagreement about which hypothesis is simpler, trying to introduce concepts such as ill-defined program lengths that can’t be computed, can only obscure the terms of the debate, rather than clarifying them.
From the paper you cited:
“(...) MML usually (but not necessarily) restricts the reference machine to a non-universal form in the interest of computational feasibility. (...) As a result, MML can be, and has routinely been, applied with some confidence to many problems of machine learning (...)”
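As a sketch of what such a restricted two-part code looks like, here is a toy MML-style model selection. The coding scheme and bit costs below are made-up assumptions for illustration, not Wallace’s actual MML approximation: the total message length is the cost of stating the model plus the cost of stating the data given the model, and the shorter total wins.

```python
import math

def int_bits(n: int) -> float:
    # Crude code length for an integer: one sign bit plus magnitude bits (an assumption).
    return 1 + math.log2(abs(n) + 2)

def listing_cost(data) -> float:
    # Hypothesis 1: no structure, just list every value verbatim.
    return sum(int_bits(v) for v in data)

def progression_cost(data) -> float:
    # Hypothesis 2: an arithmetic progression (start, step) plus per-point residuals.
    start, step = data[0], data[1] - data[0]
    residuals = [v - (start + i * step) for i, v in enumerate(data)]
    model_part = int_bits(start) + int_bits(step)    # first part: the model
    data_part = sum(int_bits(r) for r in residuals)  # second part: data given the model
    return model_part + data_part

data = [3, 5, 7, 9, 11, 13, 15, 17]
# MML picks whichever hypothesis yields the shorter total message.
best = min(("listing", listing_cost(data)),
           ("progression", progression_cost(data)),
           key=lambda t: t[1])
print(best[0])  # → progression
```

Because the reference machine is restricted to these two model families (rather than arbitrary Turing machines), every quantity here is computable, which is exactly the trade-off the quoted passage describes.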
There will be such disagreement about many different hypotheses, and even when there isn’t, our common intuition will usually have approximated the informational content of the hypotheses, i.e. their complexity.
How do you suggest resolving such disagreements, or reaching common ground, without resorting to an intuition that ultimately rests on complexity measures?
How do you use Occam’s Razor without appealing to a formal notion that grounds your intuition? What does your intuition rest on, if not information theory?
Sure, but IIUC (I’ve just skimmed the paper), in order to make the comparison to Kolmogorov complexity, they consider arbitrary Turing machines as their hypotheses, which makes the analysis uncomputable.
I think that’s still an open problem. Solomonoff induction is certainly an attempt towards its formalization, but it doesn’t yield anything that can be used for reasoning in practice. Saying “my hypothesis has smaller Kolmogorov complexity than yours” is meaningless unless you can make the argument formal.
MML and KC are conceptually and theoretically highly related: MML is another stab at formalizing Occam’s Razor in a more feasible manner, using the same approach as KC. No, they are not in fact identical, if that’s what you meant (hence the different names …)
But isn’t saying “Based on Occam’s Razor, my hypothesis is smaller than yours” just as meaningless, as long as your intuition stays sufficiently fuzzy and ungrounded? Is it an open problem as soon as anyone disagrees (or on what basis would you resolve any dispute)? What use would the heuristic be, then?
I guess what I don’t understand is how you can embrace Occam’s Razor as an intuition, yet argue against the use of the branch of information theory that formalizes it, given that there are even computable variants. I agree that categorically making statements about the KC of most hypotheses is misguided, and I also dislike the misuse of the terminology as mere buzzwords.
However, it is the formalism that our intuition aspires to emulate, and to improve our intuition is to move it further towards the formalized basis it derives from, a move which you seem to reject.
It’s not just a fuzzy intuition: you can try to count the concepts, but ultimately the argument remains informal. And throwing in “informal” Kolmogorov complexity doesn’t help, so what’s the point of doing that?
I’m not sure that is the proper formalism, but even if it is, unless it provides actual tools to use in arguments, I think it’s not appropriate to use its terminology as buzzwords.
Counting concepts is an error-prone, extremely rough approximation of complexity. A fuzzy, undependable version of it, if you will.
It falls prey to problems such as (H1: A, B, C) versus (H2: A, D), with D being potentially larger or smaller than (B, C).
Or would you recommend trying to chunk out concepts of similar size? This will invariably lead you to the smallest differing unit, the smallest lexeme of your language of choice...
...and in the end, your “concept” will translate to “bit”, you’ll choose the shortest equivalent restatement of the hypothesis with the fewest concepts (bits), and you’ll compare those. Familiar?
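For what it’s worth, this “concepts collapse into bits” move can be made crudely operational today: a real compressor gives a computable upper bound on description length, and the normalized compression distance of Cilibrasi and Vitányi uses exactly that trick to compare how much information two statements share. A sketch, with made-up example strings:

```python
import zlib

def c(b: bytes) -> int:
    # Compressed length: a computable stand-in for description length.
    return len(zlib.compress(b, 9))

def ncd(x: bytes, y: bytes) -> float:
    # Normalized Compression Distance: near 0 for almost-identical inputs,
    # near 1 for unrelated ones.
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

h1 = b"all ravens are black because of pigment gene X"
h2 = b"all ravens are black because of pigment gene Y"
h3 = b"the economy contracts when interest rates rise sharply"

print(round(ncd(h1, h2), 2), round(ncd(h1, h3), 2))
```

The near-duplicate hypotheses come out much closer than the unrelated pair. This is still only an upper-bound proxy, not Kolmogorov complexity, but it shows the intuition can be grounded in something more than concept counting.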
Think of it more as moving the intuition in the right direction. Of course that implies more than just usage of the terminology and precludes definitive statements (it’s still an intuition, not a formal calculation).
Such emphasis on the roots of our intuition can yield both positive and negative effects: positive if used as a qualifier and a note of caution against our easily misguided “A is clearly more complex” intuitions, negative if we just tack “according to Kolmogorov Complexity” onto our usual fallible guesstimates to lend them unwarranted credence.
I’m not sure what exactly the focal point of our disagreement is.
I’m not against making arguments more formal; I just don’t see how Kolmogorov complexity, Solomonoff induction, etc. can be practically used for that purpose.