A theory is a model for a class of observable phenomena. A model is constructed from smaller primitive (atomic) elements connected together according to certain rules. (Ideally, the model’s behavior or structure is isomorphic to that of the class of phenomena it is intended to represent.) We can take this collection of primitive elements, plus the rules for how they can be connected, as a modeling language. Now, depending on which primitives and rules we have selected, it may become more or less difficult to express a model with behavior isomorphic to the original, requiring more or fewer primitive elements. This means that Occam’s razor will suggest different models as the simplest alternatives depending on which modeling language we have selected. Minimizing complexity in each modeling language lends a different bias toward certain models and against other models, but those biases can be varied or even reversed by changing the language that was selected. There is consequently nothing mathematically special about simplicity that lends an increased probability of correctness to simpler models.
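To make this concrete, here is a toy sketch (both "languages" and the token-counting are made up purely for illustration, not a rigorous MDL calculation): the same observed sequence gets a very different description length depending on which primitives the modeling language happens to provide.

```python
# Toy sketch: language-dependent description length.
# (The two "languages" and the token counts are invented for illustration only.)

data = "ABC" * 8  # the phenomenon we want a model of: 24 symbols

# Language 1: "ABC" happens to be an atomic primitive, plus a repeat rule.
model_L1 = ("repeat", "ABC", 8)
cost_L1 = len(model_L1)   # 3 primitive tokens

# Language 2: only individual characters are primitive.
model_L2 = tuple(data)
cost_L2 = len(model_L2)   # 24 primitive tokens

print(cost_L1, cost_L2)   # 3 vs 24: Occam's razor picks out the "simplest"
                          # model differently depending on the language chosen
```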
That said, there are valid reasons to use Occam’s razor nonetheless, and not just the reasons the author of this essay lists, such as resource constraint optimization. In fact, it is reasonable to expect that using Occam’s razor does increase the probability of correctness, but not because simplicity is good in itself. Consider the fact that human beings evolved in this environment, and that our minds are therefore tailored by evolution to be good at identifying patterns that are common within it. In other words, the modeling language used for human cognition has been optimized, to some degree, to easily express patterns that are observable in our environment. Thus, for the specific pairing of the human environment with the modeling language used by human minds, a bias towards simpler models probably is indicative of an increased likelihood that the model is appropriate to the observed class of phenomena, even though simplicity is irrelevant in the general case of an arbitrary pairing of environment and modeling language.
You’re speaking as though complexity measures the relationship between a language and the phenomena, or between the map and the territory. But I’m pretty sure complexity is actually an objective and language-independent idea, represented in its pure form in Solomonoff Induction. Complexity is a property that’s observed in the world via senses or data input mechanisms, not just something within the mind. The ease of expressing a certain statement might change depending on the language you’re using, but the statement’s absolute complexity remains the same no matter what. You don’t have to measure everything within the terms of one particular language, you can go outside the particulars and generalize.
you can go outside the particulars and generalize.
You can’t get to the outside. No matter what perspective you are indirectly looking from, you are still ultimately looking from your own perspective. (True objectivity is an illusion—it amounts to you imagining you have stepped outside of yourself.) This means that, for any given phenomenon you observe, you are going to have to encode that phenomenon into your own internal modeling language first to understand it, and you will therefore perceive some lower bound on complexity for the expression of that phenomenon. But that complexity, while it seems intrinsic to the phenomenon, is in fact intrinsic to your relationship to the phenomenon, and your ability to encode it into your own internal modeling language. It’s a magic trick played on us by our own cognitive limitations.
Complexity is a property that’s observed in the world via senses or data input mechanisms, not just something within the mind.
Senses and data input mechanisms are relationships. The observer and the object are related by the act of observation. You are looking at two systems, the observer and the object, and claiming that the observer’s difficulty in building a map of the object is a consequence of something intrinsic to the object, but you forget that you are part of this system, too, and your own relationship to the object requires you, too, to build a map of it. You therefore can’t use this as an argument to prove that this difficulty of mapping that object is intrinsic to the object, rather than to the relationship of observation.
For any given phenomenon A, I can make up a language L1 where A corresponds to a primitive element in that language. Therefore, the minimum description length for A is 1 in L1. Now imagine another language, L2, in which A has a long description length. The invariance theorem for Kolmogorov complexity, which I believe is what you are basing your intuition on, can be misinterpreted as saying that there is some minimal encoding length for a given phenomenon regardless of language. That is not what the theorem actually says, though. What it does say is that the difficulty of encoding phenomenon A in L2 is at most the difficulty of encoding A in L1 plus the difficulty of encoding L1 in L2. In other words, given that A has a minimum description length of 1 in L1, but a very long description length in L2, we can be certain that L1 also has a long description length in L2. In terms of conceptual distance, all the invariance theorem says is that if L1 is close to A, then L1 must be far from L2, because L2 is far from A. It’s just the triangle inequality in another guise. (Admittedly, conceptual distance lacks one property we typically expect of a distance measure, namely symmetry: the distance from A to B need not equal the distance from B to A. But that is irrelevant here.)
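In symbols, what the invariance theorem gives you (as I understand it) is just a one-sided bound with a language-dependent constant:

$$K_{L_2}(A) \;\le\; K_{L_1}(A) + c_{L_1 \to L_2}$$

where $K_{L}(A)$ is the minimum description length of A in language L, and the constant $c_{L_1 \to L_2}$ is roughly the length of an interpreter for L1 written in L2. It depends on the pair of languages, but not on A, which is just the "encode A in L1, then encode L1 in L2" reading above.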
You can’t get to the outside. No matter what perspective you are indirectly looking from, you are still ultimately looking from your own perspective. (True objectivity is an illusion—it amounts to you imagining you have stepped outside of yourself.) This means that, for any given phenomenon you observe, you are going to have to encode that phenomenon into your own internal modeling language first to understand it, and you will therefore perceive some lower bound on complexity for the expression of that phenomenon. But that complexity, while it seems intrinsic to the phenomenon, is in fact intrinsic to your relationship to the phenomenon, and your ability to encode it into your own internal modeling language. It’s a magic trick played on us by our own cognitive limitations.
I think my objection stands regardless of whether there is one subjective reality or one objective reality. The important aspect of my objection is the “oneness”, not the objectivity, I believe. Earlier, you said:
depending on which primitives and rules we have selected… Occam’s razor will suggest different models… Minimizing complexity in each modeling language lends a different bias toward certain models and against other models, but those biases can be varied or even reversed by changing the language that was selected. There is consequently nothing mathematically special about simplicity that lends an increased probability of correctness to simpler models.
But since we are already, inevitably, embedded within a certain subjective modeling language, we are already committed to the strengths and weaknesses of that language. The further away from our primitives we get, the worse a compromise we end up making, since some of the ways in which we diverge from our primitives will be “wrong”, making sacrifices that do not pay off. The best we can do is break even; the walk away from our primitives is therefore a biased random walk, and will drift towards worse results.
There might also be a sense in which the worst we can do is break even, but I’m pretty sure that way madness lies. Defining yourself to be correct doesn’t count as correctness, in my book of arbitrary values. A less subjective argument for this view of values: insofar as primitives are difficult to change, when you think you’ve changed a primitive, it’s somewhat likely that what you’ve actually done is increase your internal inconsistency (and, incidentally, thereby violate the assumptions of the no-free-lunch theorem).
Whether you call the primitives “objective” or “subjective” is beside the point. What’s important is that they’re there at all.
You should read up on regularization and the no free lunch theorem, if you aren’t already familiar with them.