This touches on some issues I’d wanted to discuss: abstraction hierarchies, and incompatible abstraction layers.
So, here’s a new conditional independence condition for “large” systems, i.e. systems with an infinite number of Xi’s: given Λ, any finite subset of the Xi’s must be approximately independent (i.e. mutual information below some small ϵ) of all but a finite number of the other Xi’s
Suppose we have a number of tree-instances X1,X2,...,Xn. Given a sufficiently large ϵ, we can compute a valid “general tree abstraction”. But what if we’ve picked a lower ϵ, and are really committed to keeping it low, for some reason?
Here’s a trick:
We separate tree-instances into sets S1,S2,...,Sm such that we can compute the corresponding “first-order” abstractions Λ1,Λ2,...,Λm over each set, and they would be valid, in the sense that any two Xi,Xj∈Sk would have mutual information below ϵ when conditioned on Λk[1]. Plausibly, that would recover a set of abstractions corresponding to “tree species”.
Then we repeat the trick: split the first-order abstractions Λ1,Λ2,...,Λm into sets, and generate second-order abstractions ΛII1,ΛII2,...,ΛIIq. That may recover, say, genuses.
We do this iteratively until getting a single nth-order abstraction ΛΩ, standing-in for “all trees”.
I think it would all have sensible behavior. Conditioning any given tree-instance Xi on ΛΩ would only explain general facts about the trees, as we wanted. Conditioning on the appropriate lower-level abstractions would explain progressively more information about Xi. Conditioning a Xi∉Sj on ΛIj, in turn, would turn up some information that’s in excess, or make some wrong predictions, but get the general facts right. (And you can also condition first-order abstractions on higher-order abstractions, etc.)
The question is: how do we pick ϵ? One potential answer is that, given some set of instances X1,X2,...,Xn, we always try for the lowestϵ possible[1]. Perhaps that’s the mathematical description of taxonomy, even? “Given a set of instances, generate the abstraction hierarchy that minimizes ϵ at each abstraction-level.”
There’s a different way to go about it, though. Suppose that, instead of picking ϵ and then deciding on groupings, we first split instances X1,X2,...,Xn into sets, according to some rule? We have to be able to do that: we’ve somehow decided to abstract over these specific X1,X2,...,Xn to begin with, so we already have some way to generate groupings. (We’ve somehow arrived at a set of tree-instances to abstract over, instead of a mixture of cars, trees, towels, random objects...)
So, we pick some “rule”, which is likely a natural abstraction in itself, or defined over one. Like “trees that are N years old” with separate set for every N, or “this tree has leaves” y/n, or “trees in %person%’s backyard” for every %person%. Then we split the instances into sets according to that rule, and try to summarize every set.
Important: that way, we may get meaningfully different ϵs for every set! For example, suppose we cluster trees by whose backyard they’re in.
Person A has trees of several different species growing in their yard. For them, we compute SA, the corresponding abstraction/summary ΦA[2], and some ϵA that makes ΦA be a valid abstraction.
Person B only plants trees of a single species. Again, we compute SB, ΦB, ϵB.
Obviously, ϵA>ϵB.
What does this approach yield us?
It’s a tool of analysis. We can try different rules on for size, and see if that reveals any interesting data. (Do most people grow only trees of a single species in their yard?)
It’s potentially useful for general-purpose search via constraints. Consider two different first-order abstractions, “trees of species z” Λz and trees-in-my-backyard Φmy. Computing the second-order abstraction from them would be rather arbitrary, but it’s something we may want to do during a specific planning process!
(Though note that combining any two nth-order abstractions would result in a (n+1)th-order abstraction that has at least as much information as ΛΩ. I. e., any given valid abstraction hierarchy over a given set of instances terminates in the same max-level abstraction. I’m not sure if that’s useful.)
It allows abstraction layers, as outlined below.
Consider humans, geopolitical entities, and ideological movements. They don’t have a clear hierarchy: while humans are what constitutes the latter two “layers”, ideological movements are not split across geopolitical lines (same ideologies can be present in different countries), and geopolitical entities are not split along ideological lines (a given government can have multiple competing ideologies). By implication, once you’re viewing the world in terms of ideologies, you can’t recover governments from this data; nor vice versa.
Similarly: As we’ve established, we can split trees by species ΛI1,...,ΛIn and by “whose backyard they’re in” ΦI1,...,ΦIg. But: we would not be able to recover genuses ΛII1,...,ΛIId from the backyard-data ΦI1,...,ΦIg! Once we’ve committed to the backyard-classification, we’ve closed-off species-classification!
I propose calling such incompatible abstraction hierarchies abstraction layers. Behind every abstraction layer, there’s some rule by which we’re splitting instances into sets, and such rules are/are-defined-over natural abstractions, in turn.
And, I guess, such that there’s at least one set with more than one instance, to forbid the uninteresting trivial case where there’s a one-member set for every initial instance. More generally, we’d want the number of sets to be “small” compared to the number of instances, in some sense of “small”.
This touches on some issues I’d wanted to discuss: abstraction hierarchies, and incompatible abstraction layers.
Suppose we have a number of tree-instances X1,X2,...,Xn. Given a sufficiently large ϵ, we can compute a valid “general tree abstraction”. But what if we’ve picked a lower ϵ, and are really committed to keeping it low, for some reason?
Here’s a trick:
We separate tree-instances into sets S1,S2,...,Sm such that we can compute the corresponding “first-order” abstractions Λ1,Λ2,...,Λm over each set, and they would be valid, in the sense that any two Xi,Xj∈Sk would have mutual information below ϵ when conditioned on Λk[1]. Plausibly, that would recover a set of abstractions corresponding to “tree species”.
Then we repeat the trick: split the first-order abstractions Λ1,Λ2,...,Λm into sets, and generate second-order abstractions ΛII1,ΛII2,...,ΛIIq. That may recover, say, genuses.
We do this iteratively until getting a single nth-order abstraction ΛΩ, standing-in for “all trees”.
I think it would all have sensible behavior. Conditioning any given tree-instance Xi on ΛΩ would only explain general facts about the trees, as we wanted. Conditioning on the appropriate lower-level abstractions would explain progressively more information about Xi. Conditioning a Xi∉Sj on ΛIj, in turn, would turn up some information that’s in excess, or make some wrong predictions, but get the general facts right. (And you can also condition first-order abstractions on higher-order abstractions, etc.)
The question is: how do we pick ϵ? One potential answer is that, given some set of instances X1,X2,...,Xn, we always try for the lowest ϵ possible[1]. Perhaps that’s the mathematical description of taxonomy, even? “Given a set of instances, generate the abstraction hierarchy that minimizes ϵ at each abstraction-level.”
There’s a different way to go about it, though. Suppose that, instead of picking ϵ and then deciding on groupings, we first split instances X1,X2,...,Xn into sets, according to some rule? We have to be able to do that: we’ve somehow decided to abstract over these specific X1,X2,...,Xn to begin with, so we already have some way to generate groupings. (We’ve somehow arrived at a set of tree-instances to abstract over, instead of a mixture of cars, trees, towels, random objects...)
So, we pick some “rule”, which is likely a natural abstraction in itself, or defined over one. Like “trees that are N years old” with separate set for every N, or “this tree has leaves” y/n, or “trees in %person%’s backyard” for every %person%. Then we split the instances into sets according to that rule, and try to summarize every set.
Important: that way, we may get meaningfully different ϵs for every set! For example, suppose we cluster trees by whose backyard they’re in.
Person A has trees of several different species growing in their yard. For them, we compute SA, the corresponding abstraction/summary ΦA[2], and some ϵA that makes ΦA be a valid abstraction.
Person B only plants trees of a single species. Again, we compute SB, ΦB, ϵB.
Obviously, ϵA>ϵB.
What does this approach yield us?
It’s a tool of analysis. We can try different rules on for size, and see if that reveals any interesting data. (Do most people grow only trees of a single species in their yard?)
It’s potentially useful for general-purpose search via constraints. Consider two different first-order abstractions, “trees of species z” Λz and trees-in-my-backyard Φmy. Computing the second-order abstraction from them would be rather arbitrary, but it’s something we may want to do during a specific planning process!
(Though note that combining any two nth-order abstractions would result in a (n+1)th-order abstraction that has at least as much information as ΛΩ. I. e., any given valid abstraction hierarchy over a given set of instances terminates in the same max-level abstraction. I’m not sure if that’s useful.)
It allows abstraction layers, as outlined below.
Consider humans, geopolitical entities, and ideological movements. They don’t have a clear hierarchy: while humans are what constitutes the latter two “layers”, ideological movements are not split across geopolitical lines (same ideologies can be present in different countries), and geopolitical entities are not split along ideological lines (a given government can have multiple competing ideologies). By implication, once you’re viewing the world in terms of ideologies, you can’t recover governments from this data; nor vice versa.
Similarly: As we’ve established, we can split trees by species ΛI1,...,ΛIn and by “whose backyard they’re in” ΦI1,...,ΦIg. But: we would not be able to recover genuses ΛII1,...,ΛIId from the backyard-data ΦI1,...,ΦIg! Once we’ve committed to the backyard-classification, we’ve closed-off species-classification!
I propose calling such incompatible abstraction hierarchies abstraction layers. Behind every abstraction layer, there’s some rule by which we’re splitting instances into sets, and such rules are/are-defined-over natural abstractions, in turn.
Does all that make sense, on your model?
And, I guess, such that there’s at least one set with more than one instance, to forbid the uninteresting trivial case where there’s a one-member set for every initial instance. More generally, we’d want the number of sets to be “small” compared to the number of instances, in some sense of “small”.
Reason for the change in notation from Λ will be apparent later.
Or maybe it’s still useful, for general-purpose search via constraints?