I believe the last section of this post is pointing to something central and important which is really difficult to articulate. Which is ironic, since “how does articulating concepts work?” is kinda part of it.
To me, it feels like Bayesianism is missing an API. Getting embeddedness and reflection and communication right all require the model talking about its own API, and that in turn requires figuring out what the API is supposed to be—like how the literal meanings of things passed in and out actually get tied to the world.
I agree, it’s important to create, or at least detect, well-aligned agents. You suggest we need an honesty API.
Nope, that is not what I’m talking about here. At least I don’t think so. The thing I’m talking about applies even when there’s only one agent; it’s a question of how that agent’s own internal symbols end up connected to physical things in the world, for purposes of the agent’s own reasoning. Honesty when communicating with other agents is related, but sort of tangential.
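To make that a bit more concrete, here is a toy sketch (entirely my own illustration – the ToyBayesAgent class and its labels are invented for this comment) of the gap I mean: a discrete Bayesian updater just shuffles probability mass among opaque labels, and nothing in its interface says what those labels refer to out in the world.

```python
class ToyBayesAgent:
    """Discrete Bayesian updater whose hypotheses are just opaque labels."""

    def __init__(self, prior, likelihood):
        # prior: {hypothesis_label: P(h)}
        # likelihood: {(observation_label, hypothesis_label): P(obs | h)}
        self.posterior = dict(prior)
        self.likelihood = likelihood

    def update(self, observation):
        # Bayes' rule: P(h | obs) is proportional to P(obs | h) * P(h).
        # The observation is itself just a label handed in from outside;
        # how it hooks up to the physical world appears nowhere in this interface.
        unnormalized = {h: self.likelihood.get((observation, h), 0.0) * p
                        for h, p in self.posterior.items()}
        total = sum(unnormalized.values())
        self.posterior = {h: p / total for h, p in unnormalized.items()}
        return self.posterior


agent = ToyBayesAgent(
    prior={"apple": 0.5, "not_apple": 0.5},
    likelihood={("round_red_thing", "apple"): 0.9,
                ("round_red_thing", "not_apple"): 0.2},
)
print(agent.update("round_red_thing"))  # the labels could "mean" anything at all
```

The update itself is perfectly well-defined; the question of why “apple” or “round_red_thing” count as being about anything physical lives entirely outside this interface, and that is the part that seems to lack a theory.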
Aren’t the symbols hardcoded to mean something? Your parents keep using “apple” to refer to an apple, and you hardcode that symbol to stand for apples. Of course, the devil is in the details, but I think developmental linguistics probably has some existing literature on this, and the question doesn’t seem that mysterious to me.
You have a concept of apples before learning the word (otherwise you wouldn’t know which thing in our very-high-dimensional world to tie the word to; word-learning does not require nearly enough examples to narrow down the concept space without some pre-existing concept). Whatever data structure your brain uses to represent the concept is separate from the word itself, and that’s the thing I’m talking about here.
Well, really I’m talking about the idealized theoretical Bayesian version of that thing. Point is, it should not require other agents in the picture, including your parents.
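A rough back-of-envelope version of the “not nearly enough examples” point (all numbers here are made up purely for illustration): if a concept were just an arbitrary subset of the distinguishable kinds of things a child encounters, each labeled example would supply at most one bit, so the number of examples needed would scale with the number of kinds.

```python
# Back-of-envelope with made-up numbers: without a pre-existing concept space,
# a "concept" could be any subset of the object-kinds you can distinguish.
n_object_kinds = 10_000        # hypothetical count of distinguishable kinds in a child's world
bits_to_pin_down_concept = n_object_kinds  # log2(2**n) = n: one bit per kind (in or out of the concept)
bits_per_labeled_example = 1   # "this is an apple" / "this is not an apple"

examples_needed_from_scratch = bits_to_pin_down_concept // bits_per_labeled_example
print(f"labeled examples needed with no prior structure: ~{examples_needed_from_scratch:,}")
print("labeled examples children actually seem to need per word: a handful")
```

The gap between those two numbers is the work that has to be done by a pre-existing concept (or a very strong prior over concepts), not by exposure to the word itself.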
That doesn’t seem right, intuitively. People (humans) have pre-existing capabilities (‘instincts’) by the time they’re learning words, and one of them is the ability to ‘follow pointing’, i.e. look at something someone else is pointing at. In practice, that can involve considerable iteration, e.g. ‘no, not that other round red (or green) thing; this one right here’.
The parts of our minds that learn words also seem to have access to an API for analyzing and then later recognizing specific visual patterns, e.g. shapes, colors, materials, and faces. The internals of that visual-system API are pretty sophisticated too.
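Very roughly, the interface I have in mind looks something like this sketch (the VisualPercept type and these particular feature names are placeholders I’m inventing, not anyone’s actual model): the word-learning machinery never touches raw retinal input, only percepts that have already been analyzed into re-recognizable features, plus joint-attention operations like following a point.

```python
from dataclasses import dataclass

@dataclass
class VisualPercept:
    """What the visual system hands upward: already-analyzed features, not pixels."""
    shape: str      # e.g. "roundish"
    color: str      # e.g. "red"
    material: str   # e.g. "waxy-smooth"
    is_face: bool

def attend_to_pointed_thing(scene: list[VisualPercept], pointing_target: int) -> VisualPercept:
    """Joint attention: pick out the percept the other agent is pointing at."""
    return scene[pointing_target]

scene = [VisualPercept("roundish", "red", "waxy-smooth", False),
         VisualPercept("roundish", "green", "waxy-smooth", False)]
print(attend_to_pointed_thing(scene, 1))  # "no, not that one; this one right here"
```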
Learning language must require other agents, at least indirectly, though – right? It only exists because some agents use (or used) it.
But I’m skeptical that an ‘idealized theoretical Bayesian agent’ could learn language on its own – there is no such thing as “an ideal philosophy student of perfect emptiness”.
I’m not talking about learning language; I’m talking about how we chunk the world into objects. It’s not about learning the word “tree”, it’s about recognizing the category-of-things which we happen to call trees. It’s about thinking that maybe the things I know about one of the things-we-call-trees are likely to generalize to other things-I-call-trees. We must do that before attaching the word “tree” to the concept, because otherwise it would take millions of examples to home in on which concept the word is trying to point to.
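Here is a toy analogy for that ordering, in code (my own illustration, not a claim about how brains actually do it): cluster the unlabeled observations first, and then a single pointed-at example is enough to attach a word to an entire cluster.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unlabeled observations in some feature space, falling into two natural clumps.
trees = rng.normal(loc=[10.0, 2.0], scale=0.5, size=(50, 2))
rocks = rng.normal(loc=[1.0, 3.0], scale=0.5, size=(50, 2))
observations = np.vstack([trees, rocks])

# Step 1: chunking, with no words involved -- a crude 2-means clustering.
centroids = np.array([observations[0], observations[-1]])  # deliberately far-apart seeds
for _ in range(10):
    distances = np.linalg.norm(observations[:, None, :] - centroids[None, :, :], axis=2)
    chunk_of = distances.argmin(axis=1)
    centroids = np.array([observations[chunk_of == k].mean(axis=0) for k in range(2)])

# Step 2: one labeled example ("that one is a 'tree'") names a whole chunk.
pointed_at = observations[0]
tree_chunk = np.linalg.norm(centroids - pointed_at, axis=1).argmin()
print(f"'tree' now generalizes to {np.sum(chunk_of == tree_chunk)} of {len(observations)} observations")
```

If the chunking were not already there, that single example would tell you almost nothing about which of the astronomically many possible groupings the word is supposed to pick out.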
I agree that chunking precedes naming – historically. But I think most (a lot of?) people learn the name first and then have to (try to) reverse-engineer the chunking. Some of this definitely happens iteratively and interactively, e.g. when teaching children.
And I’m very unsure that there is one simple answer to “how we chunk the world into objects”. I think that might explain why some people chunk the same words so differently: there’s no (obvious) unique best way to chunk some ideas for everyone.
I know that people who are relatively competent at chess reliably chunk board states in a way that I don’t (as I’m not at all good at chess).
Similarly, people who already know a variety of different plants (at least) seem to chunk them in a way that I don’t.
I don’t think that last claim is true – that we must do the chunking before attaching the word. If anything, some ideas/concepts seem to start with very coarse chunking based on a very small number of prototypical examples, and then it does take ‘millions’ of subsequent examples to refine the chunking. And that refinement is definitely sometimes mediated directly via language.
I think there is a lot of pre-verbal or non-verbal chunking involved in thinking.
But I also think it’s very common to not have a chunk (“concept”) before learning the word, even for something like apples.
Though I also think the opposite is pretty common – ‘Oh, that’s the word for those!’.
There’s an attention component to chunking. I could chunk some set of things into neat categories – if I examined them closely for a sufficient duration. But I mostly don’t – relative to all the possible things I could be examining.
I think you’re not getting something about why the question is an interesting one.
The meaning of “meaning” is a contentious philosophical issue, and although developmental psychology could provide some inspiration, I highly doubt it has produced a rigorous formal answer. Saying the word “hardcoded” hardly sheds any light (especially since “hardcoded” usually contrasts with “learned”, and you’re somehow suggesting that we learn hardcoded answers...).