In particular, most noun-words should presumably ground out in something-like-clusters over “data” from the physical world, and then other kinds of words ground out in things like clusters-of-trajectories or clusters-of-modular-representations-of-many-subclusters or things along those lines.
I’m concerned that you’re underestimating how human-centric and speaker-centric many (most?) nouns are.
One class of examples involves valence. Consider the word “contamination”. “Contamination” has to be bad (= negative valence) by definition; otherwise you would use a different word. For example, nobody would say “the French fries got contaminated by a delicious pinch of salt”. There are many things like that—more detail in my recent post here (§3.4). Anyway, I claim that valence-from-Person-X’s-perspective isn’t compactly describable in terms of real-world stuff. The natural description of valence is the output of an algorithm in Person X’s brain, an algorithm related to TD learning with a pretty complicated reward function that has been running for the speaker’s whole lifetime up to this moment.
(This example is particularly important. If a human says “I like human flourishing”, the way that cashes out is intimately tied to “flourishing” having positive valence, so it’s a little bit like tautologically saying “I like [stuff-that-I-like]”. Thus a flourishing-maximizing AGI can’t make appropriate real-world decisions unless it has captured everything about human valence, which might need to include out-of-distribution stuff that the actual valence algorithm can obviously spit out when appropriately queried, but that can’t necessarily be figured out by interpolation from existing human-provided data in the absence of a running human valence algorithm. I think.)
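To gesture at what I mean by “the output of an algorithm” rather than a compact function of real-world features, here is a deliberately toy sketch. Everything in it (the discrete states, the TD(0) update, the hypothetical reward_fn) is an oversimplified illustration, not a claim about how the brain actually does it:

```python
# Toy sketch of "valence as the output of a lifelong learning algorithm".
# The discrete states, the TD(0) update, and reward_fn are all placeholder
# simplifications; the real thing involves a far more complicated reward
# function and learning machinery.

def td0_valence(lifetime_episodes, reward_fn, alpha=0.1, gamma=0.9):
    """Learn a valence-like value estimate V(state) from experience via TD(0)."""
    V = {}  # state -> learned valence (estimate of expected discounted reward)
    for episode in lifetime_episodes:              # a lifetime of experience
        for state, next_state in zip(episode, episode[1:]):
            r = reward_fn(state, next_state)       # person-specific, complicated
            v, v_next = V.get(state, 0.0), V.get(next_state, 0.0)
            V[state] = v + alpha * (r + gamma * v_next - v)  # TD(0) update
    return V
```

The point of the sketch: the learned V depends on the person’s entire history plus their idiosyncratic reward function, so “negative valence for Person X” is most naturally described by pointing at that algorithm-run, not by some compact predicate over the external situation.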
Another class of examples involves frame semantics and (relatedly) analogical reasoning. If you’re not familiar with frame semantics, I have a one-paragraph summary here. If I call a blob of atoms a “guest”, not just a “person”, I’m making a claim that it is appropriate to conceive of the situation, and their role in it, in a certain frame, i.e. by analogy to a certain class of other previous situations that the listener is already familiar with. Hofstadter has a whole book on how analogies are central to how nouns (and verbs, and thoughts more generally) work, a book which I have been gradually reading over the past year or two (the book is good but annoyingly long-winded).
Another (related) issue is what the speaker and listener are paying attention to. If there’s a bowl of soup on the table, you can think of it as “a bowl of soup”, or “an object being cooled convectively”, or “a health food”, or “a spicy food”, or “a dinner food”, or “a red object”, etc.
I’ll clarify which things I do and do not expect a “first-pass” version of reductive semantics (let’s call it reductive semantics 1.0) to handle.
As a very loose heuristic… you know the trope where some character takes everything completely literally? That’s reductive semantics 1.0. So this is definitely not something which would handle all of language semantics, and definitely not something where we should just directly take the reductive-semantics-1.0 unpacking of “human flourishing” and aim an AI at it. The main purpose of the project would be to mine language-structure for a lot of bits of information for theory-building purposes, not to directly solve alignment with it.
That said, I do expect that reductive semantics 1.0 would handle a lot of the pieces you’re pointing to, at least to some extent.
Speaker-centric: I definitely expect reductive semantics 1.0 to involve the semantic equivalent of libraries which are imported by default unless someone specifies otherwise, and those libraries would definitely include pointers to the speaker and listener (“I” and “you”, etc). And of course a lot of language—probably the majority in practice—would hook into those pointers.
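Roughly what I have in mind by “imported by default”, as a sketch (all names here are made up for the example, not a proposed formalism):

```python
# Sketch of a "default-imported" semantic context: bindings assumed to be in
# scope unless the utterance overrides them. All names are invented for the
# example.

DEFAULT_CONTEXT = {
    "speaker":  "<pointer to whoever is producing the utterance>",
    "listener": "<pointer to whoever the utterance is addressed to>",
    "now":      "<pointer to the time of utterance>",
    "here":     "<pointer to the place of utterance>",
}

INDEXICALS = {"I": "speaker", "you": "listener", "now": "now", "here": "here"}

def resolve(word, context=DEFAULT_CONTEXT):
    """Indexical words hook directly into the default pointers."""
    return context[INDEXICALS[word]] if word in INDEXICALS else word
```

Valence connotations (next point) would then hook into the speaker pointer the same way.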
Valence: I would expect reductive semantics 1.0 to handle connotations in general, including valence connotations. (Yes, I said above that “completely literally” is a good heuristic for reductive semantics 1.0, but realistically I do not think one can actually do “completely literal” semantics properly without also handling a lot of connotations.) In the case of valence specifically, the connotation would be something roughly like “speaker is an agent (always assumed background), and this has a negative impact on their goals/utility/whatever”. That does not actually require mapping out the whole valence-learning algorithm in the human brain. Consider: I can mathematically state that some change decreases a certain agent’s utility without having to know what that agent’s utility function is or how it’s implemented or anything like that, though I do need to make some assumption about the type-signature of the agent’s goal-structure (e.g. that it’s a utility function, in this oversimplified example).
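To spell out the type-signature point: assume only that the speaker’s goal-structure can be summarized by some utility function U_X over world-states (an oversimplification, as noted). Then the valence connotation can be written down without knowing anything further about U_X:

```latex
% Assumption: the speaker's goals are summarized by some unknown utility
% function with type signature  U_X : \mathcal{W} \to \mathbb{R}.
% Then the valence connotation of, e.g., "the food got contaminated" is roughly
U_X(w_{\mathrm{after}}) < U_X(w_{\mathrm{before}})
% (or, in expectation,
%   \mathbb{E}[U_X \mid \text{contamination}] < \mathbb{E}[U_X \mid \text{no contamination}] ).
% This is a well-formed statement for any U_X of that type; nothing about how
% U_X is implemented (e.g. the brain's valence-learning machinery) is needed.
```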
Frame semantics and analogies: a very core part of any decent model of semantics will be which things a word implies and/or binds to in context—e.g. the context implied by “guest”, or the subject/object to which various words bind. So that would definitely be a core part of reductive semantics 1.0. And once that structure is present, a lot of analogy naturally appears by using words in atypical contexts, and then binding them to most-probable matches in those contexts.
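A cartoon of the binding structure I mean (the frame, roles, and scoring here are invented for the example; in reality this structure would be learned, not hand-written):

```python
# Cartoon of frame semantics + binding: a word evokes a frame with roles, and
# context fills the roles with the most plausible available entities.
# The frame, roles, and "fit" scores are invented for the example.

HOSPITALITY_FRAME = {
    "roles": ("host", "guest", "venue"),
    "expectations": {
        "guest": "is welcomed, behaves graciously, eventually leaves",
        "host": "offers food/comfort, has standing authority over the venue",
    },
}

def evoke(word):
    """Calling someone a 'guest' doesn't just label them; it evokes a frame."""
    return HOSPITALITY_FRAME if word == "guest" else None

def bind(frame, entities):
    """Fill each role with the contextually most probable entity (stub scoring)."""
    return {role: max(entities, key=lambda e: e["fit"].get(role, 0.0))
            for role in frame["roles"]}
```

Analogy then drops out of using the word in an atypical context (a parasite as the “guest” of its host, say) and letting the binding step find the best available role-fillers there.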
What the speaker/listener are paying attention to: I don’t expect this to be very centrally relevant to reductive semantics 1.0, other than maybe for word sense disambiguation. There are a huge number of properties/frames one can apply to any given object, and that’s something I expect would naturally drop out of reductive semantics 1.0, but mostly in an implicit rather than explicit way.
A likely-relevant point underlying all of this: I do expect all of this structure to naturally drop out of clustering-like structures on probabilistic world models, once one moves past the simplest possible clusters and starts doing things like “clusters of trajectories of clusters” or “clusters of components of modular representations of many subclusters” or “clusters of joint compressions of members of two different clusters” or other such higher-level things on top of the foundational notion of clusters. Some of them require some specific structure in the underlying world model (like e.g. a self-model), but the most central pieces (like e.g. frame semantics) mostly seem-to-me like they should naturally stem from higher-level clustering-like operations.
(Also, side note: I’m wording things here in terms of “clusters” because I expect readers are more accustomed to thinking of word-meanings that way, but in my own work I think in terms of natural latents rather than clusters.)
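For concreteness, here’s a toy version of “clusters of trajectories of clusters” (purely illustrative; the particular choices, like summarizing a trajectory by its cluster-visit histogram and using k-means at both levels, are arbitrary simplifications):

```python
# Toy "clusters of trajectories of clusters": cluster raw observations first,
# then cluster whole trajectories according to which low-level clusters they
# visit. Purely illustrative; a real world model would be much richer.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Fake data: 50 trajectories, each a sequence of 30 observations in R^2,
# shifted by 0 or 5 so there is some structure to find.
trajectories = [rng.normal(size=(30, 2)) + rng.choice([0.0, 5.0]) for _ in range(50)]

# Level 1: clusters over individual observations (the "noun-like" clusters).
all_obs = np.concatenate(trajectories)
level1 = KMeans(n_clusters=4, n_init=10, random_state=0).fit(all_obs)

# Level 2: summarize each trajectory by how often it visits each level-1
# cluster, then cluster those summaries, i.e. cluster trajectories-of-clusters.
def visit_histogram(traj):
    labels = level1.predict(traj)
    return np.bincount(labels, minlength=4) / len(labels)

traj_features = np.stack([visit_histogram(t) for t in trajectories])
level2_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(traj_features)
```

Obviously this toy version leaves out everything interesting (modularity, joint compression, self-models, and the natural-latents framing I actually use); it’s just to pin down mechanically what “higher-level clustering-like operations on top of clusters” means.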
the connotation would be something roughly like “speaker is an agent (always assumed background), and this has a negative impact on their goals/utility/whatever”. That does not actually require mapping out the whole valence-learning algorithm in the human brain.
I agree that understanding “flourishing-as-understood-by-the-speaker” can just have a pointer to the speaker’s goals/utility/valence/whatever; I was saying that maximizing “flourishing-as-understood-by-the-speaker” needs to be able to query/unpack those goals/utility/valence/whatever in detail. (I think we’re on the same page here.)