Ok, let me continue to ask questions.

How do the statistically-oriented theories of pragmatics and the linguistic theories of semantics go together?
Math semantics, in the denotational and operational senses, I kinda understand: you demonstrate the semantics of a mathematical system by providing some outside mathematical object which models it. This also works for CS semantics, but it does come with the proviso that we include ⊥ (bottom) as an element of our denotational domains, and that our semantics may bottom out in “the machine does things”, i.e. translation to opcodes.
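To make “outside mathematical object” concrete, here is a toy sketch of my own (purely illustrative, nothing standard): a denotational semantics for a tiny expression language, with an explicit ⊥ for expressions that denote nothing.

    # Toy denotational semantics (illustration only). The semantic
    # domain is Python values plus an explicit Bottom element.
    class Bottom:
        def __repr__(self):
            return "⊥"

    BOT = Bottom()

    def denote(expr, env):
        """Map a syntax tree onto an element of the semantic domain."""
        tag = expr[0]
        if tag == "lit":                    # ("lit", 3)
            return expr[1]
        if tag == "var":                    # ("var", "x")
            return env.get(expr[1], BOT)    # unbound variables denote ⊥
        if tag == "add":                    # ("add", e1, e2)
            a, b = denote(expr[1], env), denote(expr[2], env)
            return BOT if BOT in (a, b) else a + b   # strict in ⊥
        if tag == "div":                    # ("div", e1, e2)
            a, b = denote(expr[1], env), denote(expr[2], env)
            return BOT if BOT in (a, b) or b == 0 else a / b
        return BOT

    print(denote(("add", ("lit", 1), ("div", ("lit", 4), ("lit", 2))), {}))  # 3.0
    print(denote(("div", ("lit", 1), ("lit", 0)), {}))                       # ⊥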
The philosophical approach seems to wave words around like they’re not talking about how to make words mean things, or else it just defers to the mathematical approach. I again wish to reference Plato’s Camera, and go with Domain Portrayal Semantics. That at least gives us a good footing for talking about how and why symbol grounding makes sense, as a feature of cognition that must necessarily happen in order for a mind to work.
What you said about the statistical nature of real cognition would be considered, in cognitive psychology, as just one perspective on the issue: alas, there are many.
Nonetheless, it is considered one of the better-supported hypotheses in cognitive science and theoretical neuroscience.
At this point in time I can only say that my despair at the hugeness of this issue leaves me with nothing much more to say, except that I am trying to write that book, but I might never get around to it. And in the meantime I can only try, for my part, to write some answers to more specific questions within that larger whole.
Fair enough. There are really two aspects to semantics: grounding and compositionality. An elementary distinction, of course, but with some hidden subtlety to it … because many texts focus on one of them and do a quick wave of the hand at the other (it is usually the grounding aspect that gets short shrift, while the compositionality aspect takes center stage).
[Quick review for those who might need it: grounding is the question of how (among other things) the basic terms of your language or concept-encoding system map onto “things in the world”, whereas compositionality is how it is that combinations of basic terms/concepts can ‘mean’ something in such a way that the meaning of a combination can be derived from the meaning of the constituents plus the arrangement of the constituents.]
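A toy illustration of the two aspects, with a made-up miniature world (my own sketch, not from any particular theory): grounding lives in the lexicon, compositionality in the rule that combines entries.

    # Grounding: basic terms denote sets of individuals in a tiny "world".
    lexicon = {
        "dog":   {"rex"},
        "cat":   {"felix"},
        "bird":  {"tweety"},
        "barks": {"rex"},
        "flies": {"tweety"},
    }

    # Compositionality: the meaning of "<noun> <verb>" is computed from
    # the meanings of its parts plus their arrangement.
    def meaning(phrase):
        noun, verb = phrase.split()
        return bool(lexicon[noun] & lexicon[verb])

    print(meaning("dog barks"))  # True
    print(meaning("cat flies"))  # False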
So, having said that, a few observations.
Denotational and operational semantics of programming languages or formal systems … well, there we have a bit of a closed universe, no? And things get awfully (deceptively) easy when we drop down into closed universes, as Winograd and the other Blocks World enthusiasts realized rather quickly. You hinted at that in your comment when you said:
… and that our semantics may bottom out in “the machine does things”, i.e. translation to opcodes.
We can then jump straight from too simple to ridiculously abstract, finding ourselves listening to philosophical explanations of semantics, on which subject you said:
The philosophical approach seems to wave words around like they’re not talking about how to make words mean things...
Concisely put, and I am not sure I disagree (too much, at any rate).
Then we can jump sideways to psychology (and I will lump neuroscientists/neurophilosophers like Patricia Churchland in with the psychologists). I haven’t read any of PC’s stuff for quite a while, but Plato’s Camera does look to be above-average quality, so I might give it a try. However, looking at the link you supplied I was able to grok where she was coming from with Domain Portrayal Semantics, and I have to say that there are some problems with it. (She may deal with the problems later, I don’t know, so take the following as provisional.)
Her idea of a Domain Portrayal Semantics is very static: just a state-space divide-and-conquer, really. The problem with that is that in real psychological contexts people regard concepts as totally malleable in all sorts of ways. They shift the boundaries around over time, in different contexts, and with different attitudes. So, for example, I can take you into my workshop, which is undergoing renovation at the moment, and, holding in my hand a takeout meal for you and the other visitors, I can say “find some chairs, a lamp, and a dining table”. There are zero chairs, lamps, and dining tables in the room. But, faced with the takeout that is getting cold, you look around and find (a) a railing sticking out of the wall, which becomes a chair because you can kinda sit on it, (b) a blowtorch that can supply light, and (c) a tea chest with a pile of stuff on it, from which the stuff can be removed to make a dining table. All of those things can justifiably be called chairs, tables, and lamps because of their functionality.
I am sure her idea could be extended to allow for this kind of malleability, but the bottom line is that you then build your semantics on some very shifty sort of sand, not the rock that maybe everyone was hoping for.
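(A throwaway sketch of that malleability, with invented affordance features, just to pin the idea down: category membership is judged against the current goal, not against a fixed boundary.)

    # Invented example: objects categorized by what they can do right now.
    objects = {
        "railing":   {"supports_weight": True,  "emits_light": False, "flat_top": False},
        "blowtorch": {"supports_weight": False, "emits_light": True,  "flat_top": False},
        "tea_chest": {"supports_weight": True,  "emits_light": False, "flat_top": True},
    }

    # "Chair", "lamp", "table" defined by the function needed at the moment.
    goals = {
        "chair": lambda o: o["supports_weight"],
        "lamp":  lambda o: o["emits_light"],
        "table": lambda o: o["flat_top"],
    }

    for category, test in goals.items():
        print(category, "->", [name for name, feats in objects.items() if test(feats)])
    # chair -> ['railing', 'tea_chest']; lamp -> ['blowtorch']; table -> ['tea_chest']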
(I have to cut off this reply to go do a task. Hopefully I’ll get back to it later.)
Plato’s Camera is well above average for a philosophy-of-mind book, but I still think it focuses too heavily on relatively old results about what artificial neural networks, both supervised and unsupervised, can do. My Kindle copy includes angry notes to the effect of, “If you claim we can check whether two vector-space ‘maps’ portray the same objective feature-domain by finding a homomorphism between linear transformations of them, how the hell can you handle Turing-complete domains!? The equivalence of lambda expressions is undecidable!”
This is why I’m very much a fan of the probabilistic programming approach to computational cognitive science, which clears up these kinds of issues. In a probabilistic programming setting, the probability of extensional equality for two models (where models are distributions over computation traces) is a dead simple and utterly normal query: it’s just p(X == Y), where X and Y are taken to be models (aka: thunk lambdas, aka: distributions from which we can sample). The undecidable question is thus shunted aside in favor of a check that is merely computationally intensive, but can ultimately be done in a bounded-rational way.
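To be concrete, here is a rough sketch of one reading of that query — the probability that single runs of the two models agree — estimated by plain Monte Carlo sampling (the model definitions are invented for illustration):

    import random

    # Two toy models as thunks: each call samples one computation trace.
    X = lambda: random.randint(1, 6) + random.randint(1, 6)  # sum of two dice
    Y = lambda: random.randint(2, 12)                        # uniform on 2..12

    def p_equal(X, Y, n=100_000):
        """Estimate p(X == Y) by running both models n times."""
        return sum(X() == Y() for _ in range(n)) / n

    print(p_equal(X, Y))  # ≈ 0.09 for these two toy models

The check never decides equivalence symbolically; it just spends samples, and more samples buy a tighter estimate.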
My reaction to those simple neural-net accounts of cognition is similar, in that I wanted very much to overcome their (pretty glaring) limitations. I wasn’t so much concerned with the inability to handle Turing-complete domains as with other, more practical issues. But I came to a different conclusion about the value of probabilistic programming approaches, because that seems to force the real world to conform to the idealized world of a branch of mathematics, and, like Leonardo, I don’t like telling Nature what she should be doing with her designs. ;-)
Under the heading of ‘interesting history’ it might be worth mentioning that I hit my first frustration with neural nets at the very time the field was bursting into full bloom—I was part of the revolution that shook cognitive science in the mid-to-late 1980s. Even while it was in full swing, I was already going beyond it. And I have continued on that path ever since. Tragically, the bulk of NN researchers stayed loyal to the very simplistic systems invented in the first blush of that spring, and never seemed to really understand that they had boxed themselves into a dead end.
You said:

But I came to a different conclusion about the value of probabilistic programming approaches, because that seems to force the real world to conform to the idealized world of a branch of mathematics, and, like Leonardo, I don’t like telling Nature what she should be doing with her designs. ;-)

Ah, but Nature’s elegant design for an embodied creature is precisely a bounded-Bayesian reasoner! You just minimize the free energy of the environment.
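For reference, the quantity I mean is the standard variational free energy of beliefs q(s) about hidden states s, given observations o and a generative model p(o, s) (textbook form, nothing specific to this thread):

    F = \mathbb{E}_{q(s)}\left[\ln q(s) - \ln p(o, s)\right]
      = D_{\mathrm{KL}}\left[\,q(s) \,\|\, p(s \mid o)\,\right] - \ln p(o)

Since the KL term is non-negative, minimizing F pushes q(s) toward the posterior while bounding the surprise -\ln p(o).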
You also said:

And I have continued on that path ever since. Tragically, the bulk of NN researchers stayed loyal to the very simplistic systems invented in the first blush of that spring, and never seemed to really understand that they had boxed themselves into a dead end.
Could you explain the kinds of neural networks beyond the standard feedforward, convolutional, and recurrent supervised networks? In particular, I’d really appreciate hearing a connectionist’s view on how unsupervised neural networks can learn to convert low-level sensory features into the kind of more abstracted, “objectified” (in the sense of “made objective”) features that can be used for the bottom, most concrete layer of causal modelling.
Yikes! No. :-)
That paper couldn’t be a more perfect example of what I meant when I said
that seems to force the real world to conform to the idealized world of a branch of mathematics
In other words, the paper talks about a theoretical entity which is a descriptive model (not a functional model) of one aspect of human decision-making behavior. That means you cannot jump to the conclusion that this is “Nature’s design for an embodied creature”.
About your second question. I can only give you an overview, but the essential ingredient is that to go beyond the standard neural nets you need to consider neuron-like objects that are actually free to be created and destroyed like processes on a network, and which interact with one another using more elaborate, generalized versions of the rules that govern simple nets.
From there it is easy to get to unsupervised concept building because the spontaneous activity of these atoms (my preferred term) involves searching for minimum-energy* configurations that describe the world.
* There is actually more than one type of ‘energy’ being simultaneously minimized in the systems I work on.
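By way of generic illustration only — the following toy is not my actual system, and its ‘energy’ function is invented for the example — here is the flavor of a population of atoms being created, destroyed, and adjusted while an energy score falls:

    import random

    data = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]        # observations to be "described"

    def energy(atoms):
        # Badness of description: each point's distance to its nearest
        # atom, plus a complexity cost per atom (a second kind of "energy").
        fit = sum(min(abs(x - a) for a in atoms) for x in data)
        return fit + 0.5 * len(atoms)

    atoms = [random.uniform(0, 6)]               # start with a single atom
    for _ in range(20000):
        trial = list(atoms)
        move = random.random()
        if move < 0.3:                           # create an atom
            trial.append(random.uniform(0, 6))
        elif move < 0.6 and len(trial) > 1:      # destroy an atom
            trial.pop(random.randrange(len(trial)))
        else:                                    # nudge an existing atom
            i = random.randrange(len(trial))
            trial[i] += random.gauss(0, 0.3)
        if energy(trial) <= energy(atoms):       # keep lower-energy configurations
            atoms = trial

    print(sorted(round(a, 2) for a in atoms))    # typically settles near the two clusters

The real point is just the shape of the dynamics: units are as disposable as processes, and structure emerges because the spontaneous activity keeps lowering an energy measure of how well the configuration describes the input.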
You can read a few more hints of this stuff in my 2010 paper with Trevor Harley (which is actually on a different topic, but I threw in a sketch of the cognitive system for purposes of illustrating my point in that paper).
Reference:
Loosemore, R.P.W. & Harley, T.A. (2010). Brains and Minds: On the Usefulness of Localisation Data to Cognitive Psychology. In M. Bunzl & S.J. Hanson (Eds.), Foundational Issues of Neuroimaging. Cambridge, MA: MIT Press. http://richardloosemore.com/docs/2010a_BrainImaging_rpwl_tah.pdf