Not sure where to post this so I might as well post it here:
I think the current natural abstractions theory lacks a concept of magnitude that applies across inhomogeneous, qualitatively distinct things. You might think it’s no big deal for the theory not to have this built in, because it can be derived empirically: since you make better predictions when you assume energy etc. is conserved, you will learn the concept anyway.
The issue is, there’s a shitton of things that give you better predictions, so if you are not explicitly biased towards magnitude, you might learn it very “late” in the ordering of things you learn.
Conversely, for alignment and interpretability purposes, we want to learn it very early and treat it as quite fundamental, because it gives meaning to root-cause analyses, which is what allows us to slice the world into a smallish number of discrete objects.
Yeah, this is an open problem that’s on my radar. I currently have two main potential threads on it.
First thread: treat each bit in the representation of quantities as distinct random variables, so that e.g. the higher-order and lower-order bits are separate. Then presumably there will often be good approximate natural latents (and higher-level abstract structures) over the higher-order bits, more so than the lower-order bits. I would say this is the most obvious starting point, but it also has a major drawback: “bits” of a binary number representation are an extremely artificial ontological choice for purposes of this problem. I’d strongly prefer an approach in which magnitudes drop out more naturally.
Thus the second thread: maxent. It continues to seem like there’s probably a natural way to view natural latents in a maxent form, which would involve numerically-valued natural “features” that get added together. That would provide a much less artificial notion of magnitude. However, it requires figuring out the maxent thing for natural latents, which I’ve tried and failed at several times now (though with progress each time).
“First thread: treat each bit in the representation of quantities as distinct random variables, so that e.g. the higher-order and lower-order bits are separate. […]”
Is the idea here to try to find a way to bias the representation towards the higher-order bits of a variable rather than the lower-order bits? I don’t think this is necessary, because it seems like you would get it “for free”, since the lower-order bits usually aren’t predictable without the higher-order bits.
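A minimal sketch of that last point, using a made-up toy setup (two noisy 8-bit views of a shared quantity; every number here is arbitrary): the mutual information between corresponding bits of the two views is concentrated in the higher-order bits, while the lower-order bits carry almost none.

```python
# Toy illustration: higher-order bits of a noisily-shared quantity carry most
# of the shared (mutual) information; lower-order bits look like independent noise.
# This is my own toy construction, not anything from the thread above.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

latent = rng.integers(0, 256, size=n)                       # shared 8-bit quantity
x = np.clip(latent + rng.integers(-3, 4, size=n), 0, 255)   # two noisy views of it
y = np.clip(latent + rng.integers(-3, 4, size=n), 0, 255)

def mutual_information(a, b):
    """Plug-in estimate of I(A;B) in bits for binary variables."""
    joint = np.zeros((2, 2))
    for i in range(2):
        for j in range(2):
            joint[i, j] = np.mean((a == i) & (b == j))
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    mask = joint > 0
    return np.sum(joint[mask] * np.log2(joint[mask] / np.outer(pa, pb)[mask]))

for k in range(7, -1, -1):                                   # bit 7 = highest order
    xk, yk = (x >> k) & 1, (y >> k) & 1
    print(f"bit {k}: I(x_k; y_k) ≈ {mutual_information(xk, yk):.3f} bits")
```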
The issue I’m talking about is that we don’t want a bias towards higher-order bits, we want a bias towards magnitude. As in, if there are hundreds of dynamics that can be predicted about something going on at the scale of 1 kJ or 1 gram or 1 cm, that’s about 1/10th as important as if there’s one dynamic that can be predicted about something going on at the scale of 1000 MJ or 1 kg or 10 m.
(Obviously on the agency side of things, we have a lot of concepts that allow us to make sense of this, but they all require a representation of the magnitudes, so if the epistemics don’t contain some bias towards magnitudes, the agents might mostly “miss” this.)
“Thus the second thread: maxent. It continues to seem like there’s probably a natural way to view natural latents in a maxent form […]”
Hmmm maybe.
You mean like, the macrostate $k$ is defined by an equation like: the highest-entropy distribution of microstates that satisfies $\mathbb{E}[f(\text{microstate})] = k$? My immediate skepticism would be that this is still defining the magnitudes epistemically (based on the probabilities in the expectation), whereas I suspect they would have to be based on something like a conservation law or diffusion process, but let’s take some more careful thought:
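For concreteness, the construction being gestured at is the standard exponential-family form of a maxent distribution: the highest-entropy distribution satisfying $\mathbb{E}[f(\text{microstate})] = k$ has the form $p(x) \propto \exp(\lambda f(x))$, with $\lambda$ tuned so the constraint holds. A minimal sketch with a made-up microstate space and feature (both are placeholders, not anything from the thread):

```python
# Sketch of the maxent construction under discussion: the highest-entropy
# distribution over a finite set of microstates subject to E[f(x)] = k is an
# exponential family p(x) ∝ exp(λ f(x)); we solve for λ by bisection.
# The microstate space, f, and k are arbitrary placeholders for illustration.
import numpy as np

microstates = np.arange(6)            # toy microstate space {0, ..., 5}
f = microstates.astype(float)         # feature f(x) = x (e.g. an "energy")
k = 3.5                               # target constraint E[f(X)] = k

def expected_f(lam):
    p = np.exp(lam * f)
    p /= p.sum()
    return (p * f).sum()

# E[f] is monotone increasing in λ, so bisection finds the unique solution.
lo, hi = -50.0, 50.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if expected_f(mid) < k:
        lo = mid
    else:
        hi = mid

lam = 0.5 * (lo + hi)
p = np.exp(lam * f); p /= p.sum()
print("λ ≈", round(lam, 4), " E[f] ≈", round((p * f).sum(), 4), " maxent p:", p.round(3))
```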
It seems like we’d generally not expect to be able to directly observe a microstate. So we’d really use something like $\mathbb{E}[f(\text{adjacent macrostate}) \mid \text{condition}] = k$; for instance the classical case is putting a thermometer to an object, or putting an object on a scale.
But for most natural features $g(\text{microstate})$, your uncertainty about the macrostates would (according to the central thesis of my sequence) be ~lognormally distributed. Since a lognormal distribution is maxent based on $\log(g)$ and $(\log(g))^2$, this means that a maximally-informative $f$ would be something like $f(\text{adjacent macrostate}) \approx \mathbb{E}\big[\big(\log g(\text{microstate}),\, (\log g(\text{microstate}))^2\big) \,\big|\, \text{adjacent macrostate}\big]$.
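Spelling out the maxent step being invoked here (this is just the standard derivation, not anything specific to the sequence): maximizing entropy subject to constraints on $\mathbb{E}[\log g]$ and $\mathbb{E}[(\log g)^2]$ forces an exponential-family density in those two statistics, which is exactly the lognormal family:

```latex
% Maxent over densities p(g) on g > 0, subject to two moment constraints:
%   E_p[log g] = m   and   E_p[(log g)^2] = s.
% Lagrange multipliers \lambda_1, \lambda_2 give the stationarity condition
p(g) \;\propto\; \exp\!\big(\lambda_1 \log g + \lambda_2 (\log g)^2\big),
% which for \lambda_2 < 0 is precisely a lognormal density (i.e. \log g is
% Gaussian), so (\log g, (\log g)^2) are the natural sufficient statistics
% ("features") for this family.
```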
And, because the scaling of $f$ is decided by the probabilities, its scaling is only defined up to a multiplicative factor $p$, which means $g$ is only defined up to a power, such that $g(\ldots)^p$ would be as natural as $g(\ldots)$.
Which undermines the possibility of addition, because $\sum_i x_i^p \neq \left(\sum_i x_i\right)^p$.
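(Concretely, with made-up magnitudes $x = (1, 2, 3)$ and $p = 2$: $\sum_i x_i^p = 1 + 4 + 9 = 14$, while $\left(\sum_i x_i\right)^p = 6^2 = 36$, so an additive decomposition in $g$ does not survive swapping $g$ for $g^p$.)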
As a side-note, a slogan I’ve found which communicates the relevant intuition is “information is logarithmic”. I like to imagine that the “ideal” information-theoretic encoder is a function $h$ such that $h(v \otimes w) = h(v) \oplus h(w)$ (converting tensor products to direct sums). Of course, this is kind of underdefined, doesn’t even typecheck, and can easily be used to derive contradictions; but I find it gives the right intuition in a lot of places, so I expect to eventually find a cleaner way to express it.
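One modest way to make the slogan concrete (my own illustration, with the map $h$ left as underdefined as above): on dimensions, tensor products multiply and direct sums add, so anything satisfying $h(v \otimes w) = h(v) \oplus h(w)$ has to act like a logarithm on dimension counts. A quick numpy sketch of just that bookkeeping:

```python
# Dimension bookkeeping behind "information is logarithmic": tensor products
# multiply dimensions, direct sums add them, so any h with
# h(v ⊗ w) = h(v) ⊕ h(w) must behave like a logarithm on dimension counts.
# Purely illustrative; the map h itself stays as underdefined as in the text.
import numpy as np

v = np.ones(8)    # stand-in for a vector in an 8-dimensional space
w = np.ones(32)   # stand-in for a vector in a 32-dimensional space

tensor = np.kron(v, w)               # v ⊗ w lives in an 8*32 = 256-dim space
direct_sum = np.concatenate([v, w])  # v ⊕ w lives in an 8+32 = 40-dim space

print(tensor.size, "=", v.size * w.size)       # 256 = 8 * 32
print(direct_sum.size, "=", v.size + w.size)   # 40 = 8 + 32
# Taking log2 turns the multiplicative bookkeeping into the additive one:
print(np.log2(tensor.size), "=", np.log2(v.size) + np.log2(w.size))  # 8.0 = 3 + 5
```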
In particular I think this would yield the possibility of talking about “fuzzier” concepts which lack the determinism/predictability that physical objects have. In order for the fuzzier concepts to matter, they still need to have a “magnitude” commensurate with that of the physical objects.