This post is styled after conversations we’ve had in the course of our research, put together in a way that hopefully highlights a bunch of relatively recent and (ironically) hard-to-articulate ideas around natural abstractions.
John: So we’ve been working a bit on semantics, and also separately on fluid mechanics. Our main goal for both of them is to figure out more of the higher-level natural abstract data structures. But I’m concerned that the two threads haven’t been informing each other as much as they should.
David: Okay…what do you mean by “as much as they should”? I mean, there’s the foundational natural latent framework, and that’s been useful for our thinking on both semantics and fluid mechanics. But beyond that, concretely, in what ways do (should?) semantics and fluid mechanics inform each other?
John: We should see the same types of higher-level data structures across both—e.g. the “geometry + trajectory” natural latents we used in the semantics post should, insofar as the post correctly captures the relevant concepts, generalize to recognizable “objects” in a fluid flow, like eddies (modulo adjustments for nonrigid objects).
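To make the "geometry + trajectory" idea concrete, here is a minimal sketch of what such a data structure might look like. The class and field names are our hypothetical shorthand, not anything from the semantics post, and this rigid-object version is the simplest case; for a nonrigid object like an eddy, the geometry itself would also have to vary along the trajectory.

```python
from dataclasses import dataclass

Point = tuple[float, float, float]

@dataclass
class GeometryTrajectoryLatent:
    """Hypothetical sketch of an 'object' as a shape plus a path.

    geometry: points describing the object's shape, relative to its center.
    trajectory: the center's position at each timestep.
    """
    geometry: list[Point]
    trajectory: list[Point]

    def shape_at(self, t: int) -> list[Point]:
        """The object's points in world coordinates at timestep t:
        the fixed shape translated to wherever the center is."""
        cx, cy, cz = self.trajectory[t]
        return [(x + cx, y + cy, z + cz) for (x, y, z) in self.geometry]
```

The point of the factorization is that the same `geometry` is reused at every timestep, so "what the thing is shaped like" and "where it goes" are separately compressible pieces of the latent.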
David: Sure, it did seem intuitive to think along those lines as a model for eddies in fluid flow. But in general, why expect to see the same types of data structures for semantics and fluid flow? Why not expect various phenomena in fluid flow to be more suited to representation in some data structures which aren’t the exact same type as those used for the referents of human words?
John: Specifically, I claim that the types of high-level data structures which are natural for fluid flow should be a subset of the types needed for semantics. If there’s a type of high-level data structure which is natural for fluid flow, but doesn’t match any of the semantic types (noun, verb, adjective, short phrases constructed from those, etc), then that pretty directly disproves at least one version of the natural abstraction hypothesis (and it’s a version which I currently think is probably true).
David: Woah, hold up, that sounds like a very different form of the natural abstraction hypothesis than our audience has heard before! It almost sounds like you’re saying that there are no “non-linguistic concepts”. But I know you actually think that much/most of human cognition routes through “non-linguistic concepts”.
John: Ok, there’s a couple different subtleties here.
First: there’s the distinction between a word or phrase or sentence vs the concept(s) to which it points. Like, the word “dog” evokes this whole concept in your head, this whole “data structure” so to speak, and that data structure is not itself linguistic. It involves visual concepts, probably some unnamed concepts, things which your “inner simulator” can use, etc. Usually when I say that “most human concepts/cognition are not linguistic”, that’s the main thing I’m pointing to.
Second: there’s concepts for which we don’t yet have names, but could assign names to. One easy way to find examples is to look for words in other languages which don’t have any equivalent in our language. The key point about those concepts is that they’re still the same “types of concepts” which we normally assign words to, i.e. they’re still nouns or adjectives or verbs or…, we just don’t happen to have given them names.
Now with both of those subtleties highlighted, I’ll once again try to state the claim: roughly speaking, all of the concepts used internally by humans fall into one of a few different “types”, and we have standard ways of describing each of those types of concept with words (again, think nouns, verbs, etc, but also think of the referents of short phrases you can construct from those blocks, like “dog fur” or “the sensation of heat on my toes”). And then one version of the Natural Abstraction Hypothesis would say: those types form a complete typology of the data structures which are natural in our world.
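One could imagine formalizing the "complete typology" claim as a small, closed set of type constructors that roughly mirror syntactic categories. The sketch below is purely illustrative; the type names and the `describe` function are our hypothetical shorthand, not an established formalism.

```python
from dataclasses import dataclass

# Hypothetical sketch: every natural high-level concept falls under one
# of a few type constructors, and concepts of any of these types can be
# pointed at with words (or short phrases built from words).

@dataclass(frozen=True)
class Entity:       # noun-like: "dog", "eddy"
    name: str

@dataclass(frozen=True)
class Property:     # adjective-like: "furry", "turbulent"
    name: str

@dataclass(frozen=True)
class Process:      # verb-like: "runs", "dissipates"
    name: str

@dataclass(frozen=True)
class Composite:    # short-phrase-like: a head concept plus modifiers
    head: Entity
    modifiers: tuple  # e.g. (Entity("dog"),) for "dog fur"

def describe(concept) -> str:
    """Render a concept of any of the above types as a word or phrase."""
    if isinstance(concept, Composite):
        mods = " ".join(m.name for m in concept.modifiers)
        return f"{mods} {concept.head.name}".strip()
    return concept.name
```

On this framing, the hypothesis is that the constructor set is closed: new concepts keep appearing (new instances of `Entity`, `Composite`, etc.), but no genuinely new constructor is ever needed.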
David: Alright, let me have a crack at it. New N.A.H. just dropped: The human mind is a sufficiently general simulator of the world, and faithful representations of the world “naturally” decompose into few enough basic types of data structures, that human minds wield all of the data-structure types which are naturally (efficiently, sufficiently accurately, …) “found” in the world. When we use language to talk about the world, we are pointing words at these (convergent!) internal data structures. Maybe we don’t have words for certain instances of these data structures, but in principle we can make new words whenever this comes up; we don’t need whole new types of structures.
I have some other issues to bring up, but first: Is this version of the N.A.H. actually true? Do humans actually wield the full set of basic data structures natural for modeling the whole world?
John: Yeah, so that’s a way in which this hypothesis could fail (which, to be clear, I don’t actually expect to be an issue): there could be whole new types of natural concepts which are alien to human minds. In principle, we could discover and analyze those types mathematically, and subjectively they’d be a real mindfuck.
That said, if those sorts of concepts are natural in our world, then it’s kinda weird that human minds weren’t already evolved to leverage them. Of course it’s hard to tell for sure, without some pretty powerful mathematical tools, but I think the evolutionary pressure argument should make us lean against. (Of course a counterargument could be that whole new concept-types have become natural, or will become natural, as a result of major changes in our environment—like e.g. humans or AI taking over the world.)
David: Second genre of objections which seems obvious: Part of the claim here is, “The internal data structures which language can invoke form a set that includes all the natural data-structure types useful/efficient/accurate for representing the world.” But how do we know whether our language is so deficient that a fully fleshed out Interoperable Semantics of human languages still has huge blind spots? What if we don’t yet know how to talk about many of the concepts in human cognition, even given the hypothesis that human minds contain all the basic structures relevant for modeling the world? What if nouns, adjectives, verbs, etc. are an impoverished set of semantic types?
John: That’s the second way the hypothesis could fail: maybe humans already use concepts internally which are totally un-pointable-to using language (or at least anything like current language). Probably many people who are into Eastern spiritual woo would make that claim. Mostly, I expect such woo-folk would be confused about what “pointing to a concept” normally is and how it’s supposed to work: the fact that the internal concept of a dog consists of mostly nonlinguistic stuff does not mean that the word “dog” fails to point at it. And again here, I think there’s a selection pressure argument: a lot of effort by a lot of people, along with a lot of memetic pressure, has gone into trying to linguistically point to humans’ internal concepts.
Suppose there is a whole type of concept which nobody has figured out how to point at (talk about). Then, either:
1. Those concepts are not of a natural type, so interoperability doesn’t hold, and our models of semantics make no guarantee that they should be communicable; or
2. It is a natural type, and so is communicable in the Interoperable Semantics sense, and so…it’s weird and confusing that people have failed to point to it in this hypothetical?
So basically I claim that human internal concepts are natural and we have spent enough effort as a species trying to talk about them that we’ve probably nailed down pointers to all the basic types.
David: And if human internal concepts are importantly unnatural, well then the N.A.H. fails. Sounds right.
… Wait, our models of semantics should inform fluid mechanics?!?