A while back, you mentioned that people regularly confuse universal priors with coding theory. But minimum message length is considered a restatement of Occam’s razor, just like Solomonoff induction; and MML is pretty coding-theory-ish. Which parts of coding theory are dangerous to confuse with the universal prior, and what’s the danger?
The difference I was getting at is that when constructing a code you’re taking experiences you’ve already had and then assigning them weight, whereas the universal prior, being a prior, assigns weight to strings without any reference to your experiences. So when people say “the universal prior says that Maxwell’s equations are simple and Zeus is complex”, what they actually mean is that in their experience mathematical descriptions of natural phenomena have proved more fruitful than descriptions that involve agents; the universal prior has nothing to do with this, and invoking it is dangerous as it encourages double-counting of evidence: “this explanation is more probable because it is simpler, and I know it’s simpler because it’s more probable”, when in fact the relationship between simplicity and probability is tautologous, not mutually reinforcing.
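(To make the tautology explicit: under the standard Solomonoff/Levin construction, a string’s simplicity and its prior probability are the same quantity by definition. For a universal prefix machine U, the universal prior weights a string x by summing over the programs that produce it:)

```latex
% Universal (Solomonoff) prior over strings x, for a universal
% prefix machine U; the sum ranges over programs p with U(p) = x:
\[
  M(x) \;=\; \sum_{p \,:\, U(p) = x} 2^{-\lvert p \rvert}
  \;\ge\; 2^{-K(x)},
\]
% where K(x) is the length of the shortest such program.
```

“Simpler” (smaller K) and “more probable” (larger M) are one fact stated twice, so citing one as evidence for the other adds nothing.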
This error really bothers me, because aside from its incorrectness it uses technical mathematics in a surface way, as a blunt weapon in a verbose argument that makes people unfamiliar with the math feel like they’re missing something, when in fact there is nothing there for them to get, and nothing they need to understand.
(I’ve swept the problem of “which prefix do I use?” under the rug because there are no AIT tools to deal with that and so if you want to talk about the problem of prefixes, you should do so separately from invoking AIT for some everyday hermeneutic problem. Generally if you’re invoking AIT for some object-level hermeneutic problem you’re Doing It Wrong, as has been explained most clearly by cousin_it.)
So when people say “the universal prior says that Maxwell’s equations are simple and Zeus is complex”, what they actually mean is that in their experience mathematical descriptions of natural phenomena have proved more fruitful than descriptions that involve agents
I thought it meant that if you taboo “Zeus”, the string length increases more dramatically than when you taboo “Maxwell’s equations”.
Except that’s not the case. I can make any statement arbitrarily long by continuously forcing you to taboo the words you use.
Sure, but still somehow “my grandma” is more complex than “two plus two”, even though the former string has only 10 characters and the latter has 12. So now the question is whether “Zeus” is more like “my grandma” or more like “two plus two”.
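One way to see that raw character count isn’t the relevant notion of length: what matters is the description length of what the label picks out, not the label itself. Kolmogorov complexity is uncomputable, but a general-purpose compressor gives a crude, computable stand-in for description length. A toy sketch (the zlib proxy is my illustration, not anything from AIT proper):

```python
import os
import zlib

def crude_description_length(s: str) -> int:
    # Compressed size in bytes: a rough, computable proxy for
    # description length (real Kolmogorov complexity is uncomputable).
    return len(zlib.compress(s.encode("utf-8")))

patterned = "two plus two " * 1000   # long string, but highly regular
arbitrary = os.urandom(6500).hex()   # same length, nearly patternless

assert len(patterned) == len(arbitrary) == 13000
print(crude_description_length(patterned))   # small: the pattern compresses away
print(crude_description_length(arbitrary))   # large: close to the raw length
```

The label “Zeus” is four characters, but what the prior weighs is the program that would generate a Zeus-containing world, not the four-character name.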
So when people say “the universal prior says that Maxwell’s equations are simple and Zeus is complex”, what they actually mean is that in their experience mathematical descriptions of natural phenomena have proved more fruitful than descriptions that involve agents; the universal prior has nothing to do with this, and invoking it is dangerous as it encourages double-counting of evidence
Attempting to work the dependence of my epistemology on my experience into my epistemology itself creates a cycle in the definitions of types, and wrecks the whole thing. I suspect that reformalizing as a fixpoint thing would fix the problem, but I suspect even more strongly that the point I’m already at would be a unique fixpoint and that I’d be wrecking its elegance for the sake of generalizing to hypothetical agents that I’m not and may never encounter. (Or that all such fixpoints can be encoded as prefixes, which I too feel like sweeping under the rug.)
...So, where in this schema does Minimum Message Length fit? Under AIT, or coding theory? Seems like it’d be coding theory, since it relies on your current coding to describe the encoding for the data you’re compressing. But everyone seems to refer to MML as the computable version of Kolmogorov Complexity; and it really does seem fairly equivalent.
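A toy sketch of the two-part idea (my illustration, not Wallace’s full MML machinery): the total message length is model-cost plus data-cost under that model, and the model is stated in a coding you have already fixed:

```python
import math

def two_part_length(bits, precision=8):
    # Two-part code length in bits for a binary sequence:
    #   part 1: state the model -- an estimate of P(1), quantized to
    #           `precision` bits (the coding we've already fixed);
    #   part 2: encode the data under that model (Shannon code length).
    n = len(bits)
    p = min(max(sum(bits) / n, 1e-9), 1 - 1e-9)  # clamp to avoid log(0)
    data_bits = -sum(math.log2(p if b else 1 - p) for b in bits)
    return precision + data_bits

biased = [1] * 90 + [0] * 10        # regular: the model pays for itself
fair = [i % 2 for i in range(100)]  # half ones: the model buys nothing

print(two_part_length(biased))  # ~55 bits, well under the 100 raw bits
print(two_part_length(fair))    # 108 bits, worse than sending the raw data
```

Note that the `precision` parameter, i.e. how you encode the model itself, is chosen by you in advance, which is exactly the “relies on your current coding” point.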
It seems to me that KC/SI/AIT explicitly presents the choice of UTM as an unsolved problem, while coding theory and MML implicitly assume that you use your current coding; and that that is the part that gets people into trouble when comparing Zeus and Maxwell. Is that it?
I think more or less yes, if I understand you. And more seriously, AIT is in some ways meant not to be practical: the interesting results require setting things up so that, technically, the work is pushed into the “within a constant” part, which is divorced from praxis. Practical MML intuitions don’t carry over into such extreme domains. That said, the same core intuitions inspire both; there are just other intuitions that emerge depending on what context you’re working in or mathematizing. But this is still conjecture, ’cuz I personally haven’t actually used MML on any project, even if I’ve read some results.
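(The “within a constant” part is the invariance theorem: for any two universal prefix machines U and V there is a constant depending on both machines, but not on the string, such that)

```latex
\[
  \lvert K_U(x) - K_V(x) \rvert \;\le\; c_{U,V}
  \quad \text{for all strings } x.
\]
```

The constant is essentially the length of an interpreter for one machine written for the other; it can be astronomically large, which is exactly why asymptotic AIT results license intuitions that don’t survive contact with finite, practical MML problems.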