Contra Steiner on Too Many Natural Abstractions
Reply To: “Take 4: One problem with natural abstractions is there’s too many of them.”
Epistemic Status
Unconfident. I don’t yet have a firm philosophical grounding on abstractions.
Unequally Made Abstractions
Not all natural abstractions are made equal. It seems to me that there are different degrees of “naturalness” of an abstraction. When we say a concept is a natural abstraction, we are saying things like:
The concept is highly privileged by the inductive biases of most learning algorithms that can efficiently learn our universe[1]
The more privileged the concept is in aggregate[2], the more natural the abstraction is
Learning algorithms are given weights inversely proportional to how efficiently they learn our universe
The more learning algorithms privilege the concept, the more natural an abstraction it is
Most efficient representations of our universe contain a simple embedding of the concept
The simpler the embeddings of the concept are (or the easier it is to point to them) in aggregate[3], the more natural the abstraction is
It follows that the most natural abstractions for a concept (cluster) are the ones we’d expect AI systems to actually use, i.e. the abstractions most relevant to their decision making.
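One way to make the two criteria above concrete is to write each as a weighted average. This is only a sketch under loud assumptions: the sets are taken to be finite, and every symbol below (the sets, the weights, the privilege score and the simplicity score) is placeholder notation introduced here for illustration. Nothing in it says how privilege or embedding simplicity would actually be measured.

```latex
% Assumed notation (illustrative, not established):
%   \mathcal{A}       finite set of learning algorithms that can efficiently learn our universe
%   w(A) \ge 0        weight given to algorithm A according to its efficiency (footnote [2])
%   p_A(c) \in [0,1]  how strongly A's inductive biases privilege concept c
\[
  N_{\mathrm{alg}}(c)
    = \frac{\sum_{A \in \mathcal{A}} w(A)\, p_A(c)}{\sum_{A \in \mathcal{A}} w(A)}
\]
%   \mathcal{R}       finite set of efficient representations of our universe
%   v(R) \ge 0        weight given to representation R according to its efficiency (footnote [3])
%   s_R(c) \in [0,1]  simplicity of the embedding of c in R (1 = trivially easy to point to)
\[
  N_{\mathrm{rep}}(c)
    = \frac{\sum_{R \in \mathcal{R}} v(R)\, s_R(c)}{\sum_{R \in \mathcal{R}} v(R)}
\]
```

On this reading, a concept is more natural the closer either score is to 1, and the most natural abstraction for a concept cluster is whichever candidate abstraction scores highest.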
What’s The Most Natural Abstraction for Human Values?
Consider the subset of “human values” that we’d be “happy” (were we fully informed) for powerful systems to optimise for.
[Weaker version: “the subset of human values that it is existentially safe for powerful systems to optimise for”.]
Let’s call this subset “ideal values”.
I’d guess that the “most natural” abstraction of values isn’t “ideal values” themselves but something like “the minimal latents of ideal values”.
Conclusions
I don’t think the multitude of abstractions of human values is necessarily as big a stumbling block as Steiner posited.
[1] “Learning our universe” means learning a map/world model of our universe that allows future events to be predicted effectively. The constraint of efficiency suggests that the learned world model (and inferences via it) should have low data, time, and space complexity (relative to what is attainable for optimal learning algorithms).
[2] When aggregating across learning algorithms we might want to give algorithms weights that are inversely proportional to how efficient they are at learning our universe.
[3] When aggregating across representations of our universe we might want to give representations weights that are inversely proportional to how efficiently they represent our universe.