K-complexity is apparently sometimes called “algorithmic entropy” (but not just “entropy”, I don’t think?)
Wiktionary quotes Niels Henrik Gregersen:
Algorithmic entropy is closely related to statistically defined entropy, the statistical entropy of an ensemble being, for any concisely describable ensemble, very nearly equal to the ensemble average of the algorithmic entropy of its members
I think this might be the crux!
Note the weird type mismatch: “the statistical entropy of an ensemble [...] the ensemble average of the algorithmic entropy of its members”.
So my story would be something like the following:
Many fields (thermodynamics, statistical mechanics, information theory, probability) use “entropy” to mean something equivalent to “the expectation of -log(p) for a distribution p”. Let’s call this “statistical entropy”, but in practice people call it “entropy”. (Both types are written out below.)
Algorithmic information theorists have an interestingly related but distinct concept, which they sometimes call “algorithmic entropy”.
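Concretely, my reading of the Gregersen quote is the standard bridge between the two (stated from memory, for the prefix-free version of K, and only up to additive constants that depend on the choice of universal machine):

```latex
% Statistical entropy of a computable distribution P over states x:
\[ H(P) = -\sum_x P(x)\,\log_2 P(x) \]
% Algorithmic entropy of an individual state x is its Kolmogorov complexity K(x);
% the ensemble average of K sits close to H(P) whenever P is concisely describable
% (the slack depends on how hard P itself is to describe):
\[ H(P) \;\le\; \sum_x P(x)\,K(x) \;\le\; H(P) + K(P) + O(1) \]
```

So the type mismatch is real: H is a property of the distribution, K is a property of a single state, and the claim is that the former is very nearly the P-average of the latter whenever P itself has a short description.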
Whoops, hang on a sec. Did you want your “abstract entropy” to encompass both of these?
If so, I didn’t realize that until now! That changes a lot, and I apologize sincerely if waiting for the K-complexity stuff would’ve dissipated a lot of the confusion.
Things I think contributed to my confusion:
(1) Your introduction only directly mentions / links to domain-specific types of entropy that are firmly under (type 1) “statistical entropy”.
(2) This intro post doesn’t yet touch on (type 2) algorithmic entropy, and is instead a mix of type-1 and your abstract thing where description length and probability distribution are decoupled.
(3) I suspect you were misled by the unpedagogical phrase “entropy of a macrostate” from statmech, and didn’t realize that (as used in that field) the distribution involved is determined by the macrostate in a prescribed way (or is the macrostate).
I would add a big fat disclaimer that this series is NOT just limited to type-1 entropy, and (unless you disagree with my taxonomy here) emphasize heavily that you’re including type-2 entropy.
Did you want your “abstract entropy” to encompass both of these?
Indeed I definitely do.
I would add a big fat disclaimer
There are a bunch of places where I think I flagged relevant things, and I’m curious if these seem like enough to you:
The whole post is called “abstract entropy”, which should tell you that it’s at least a little different from any “standard” form of entropy
The third example, “It helps us understand strategies for (and limits on) file compression”, is implicitly about K-complexity
This whole paragraph: “Many people reading this will have some previous facts about entropy stored in their minds, and this can sometimes be disorienting when it’s not yet clear how those facts are consistent with what I’m describing. You’re welcome to skip ahead to the relevant parts and see if they’re re-orienting; otherwise, if you can get through the whole explanation, I hope that it will eventually be addressed!”
Me being clear that I’m not a domain expert
Footnote [4], which talks about Turing machines and links to my post on Solomonoff induction
Me going on and on about binary strings and how we’re associating these with individual states; I dunno, to me this just screams K-complexity to anyone who’s heard of it (a toy compression-based sketch of that association follows this list)
“I just defined entropy as a property of specific states, but in many contexts you don’t care at all about specific states...”
… “I’ll talk about this in a future post; I think that “order” is synonymous with Kolmogorov complexity.” …
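To make the “binary strings associated with individual states” picture concrete, here’s a toy sketch (not something from the post; K-complexity itself is uncomputable, so a general-purpose compressor like zlib only upper-bounds description length relative to one fixed decoder):

```python
import os
import zlib

# Two individual "states" of the same size: one highly ordered, one random noise.
ordered = b"01" * 500      # 1000 bytes with an obvious pattern
noisy = os.urandom(1000)   # 1000 bytes of (pseudo)random data

# Assign each state a binary string: its zlib-compressed form. The length of that
# string upper-bounds the state's description length under "run zlib decompression"
# as the fixed decoding scheme.
print(len(zlib.compress(ordered)))  # tens of bytes: a short description exists
print(len(zlib.compress(noisy)))    # ~1000+ bytes: no shorter description was found
```

The point is just that the description is attached to one specific state, with no probability distribution in sight.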
I struggled with writing the intro section of this post because it felt like there were half a dozen disclaimer-type things that I wanted to get out of the way first. But each one is only relevant to a subset of people, and eventually I need to get to the content. I’m not even expecting most readers to be holding any such type-1/type-2 distinction in their mind to start, so I’d have to go out of my way to explain it before giving the disclaimer.
All that aside, I am very open to the idea that we should be calling the single-state thing something different. The “minimum average” form is the great majority of use cases.
I initially interpreted “abstract entropy” as meaning statistical entropy as opposed to thermodynamic or stat-mech or information-theoretic entropy. I think very few people encounter the phrase “algorithmic entropy” enough for it to be salient to them, so most confusion about entropy in different domains is about statistical entropy in physics and info theory. (Maybe this is different for LW readers!)
This was reinforced by the introduction because I took the mentions of file compression and assigning binary strings to states to be about (Shannon-style) coding theory, which uses statistical entropy heavily to talk about these same things and is a much bigger part of most CS textbooks/courses. (It uses phrases like “length of a codeword”, “expected length of a code [under some distribution]”, etc. and then has lots of theorems about statistical entropy being related to expected length of an optimal code.)
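For concreteness, this is the kind of coding-theory calculation I had in mind (a minimal sketch; the symbols and probabilities are made up for illustration):

```python
import math

# A made-up distribution over four symbols, chosen so the numbers come out clean.
p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

# Statistical (Shannon) entropy in bits: H(p) = -sum_x p(x) * log2 p(x).
H = -sum(px * math.log2(px) for px in p.values())

# Shannon code lengths: ceil(-log2 p(x)) bits per symbol. A prefix-free code with
# these lengths always exists (Kraft inequality), and its expected length obeys
# H(p) <= E[length] < H(p) + 1.
lengths = {x: math.ceil(-math.log2(px)) for x, px in p.items()}
expected_length = sum(p[x] * lengths[x] for x in p)

print(f"entropy H(p)         = {H:.3f} bits/symbol")
print(f"expected code length = {expected_length:.3f} bits/symbol")
```

With dyadic probabilities like these the expected length equals the entropy exactly; in general the Shannon code pays at most one extra bit per symbol.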
After getting that pattern going, I had enough momentum to see “Solomonoff”, think “sure, it’s a probability distribution, presumably he’s going to do something statistical-entropy-like with it”, and completely miss the statements that you were going to be interpreting K-complexity itself as a kind of entropy. I also missed the statement about random variables not being necessary.
I suspect this would also happen to many other people who have encountered stat mech and/or information theory, and maybe even K-complexity but not the phrase “algorithmic entropy”, but I could be wrong.
A disclaimer is probably not actually necessary, though, on reflection; I care a lot more about both of the “minimum average” qualifiers being included in statistical-entropy contexts. I don’t know exactly how to unify this with “algorithmic entropy” but I’ll wait and see what you do :)