That’s super fascinating. I’ve dabbled a bit in all of those parts of your picture, and seeing them put together like this feels really illuminating. I wish some predictive coding researcher would be so kind as to give it a look; maybe somebody here knows someone?
While reading, I was a bit confused about the set of generative models, or hypotheses. Do you have an example of what this could look like concretely? For example, when somebody tosses me an apple, is there a separate generative model for each velocity and weight, or one generative model with an uncertainty distribution over those quantities? In the latter case, one would expect another updating process acting “within” each generative model, right?
Thanks!
Yeah, I haven’t had the time or energy to start cold-emailing predictive coding experts etc. Well, I tweet this article at people now and then :-P Also, I’m still learning, the picture is in flux, and in particular I still can’t really put myself in the head of Friston, Clark, etc. so as to write a version of this that’s in their language and speaks to their perspective.
I put more at My Computational Framework for the Brain, although you’ll notice that I didn’t talk about where the generative models come from or their exact structure (which is not entirely known anyway). Three examples I often think about are: the Dileep George vision model; the active dendrite / cloned HMM sequence learning story (biological implementation by Jeff Hawkins, algorithmic implementation by Dileep George; note that neither of these has reward); and maybe (well, it’s not that concrete) also my little story about moving your toe.
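To make the cloned-HMM idea a bit more concrete, here is a minimal toy sketch of the general technique, not of either implementation linked above: every hidden state is a “clone” of exactly one observed symbol, emissions are fixed and deterministic, and only the transition matrix is learned with a rough Baum-Welch-style EM step. The alphabet size, clone count, and training sequence are made up for illustration, and the numerics are deliberately crude.

```python
import numpy as np

# Toy cloned HMM: several hidden "clones" per observed symbol, deterministic
# emissions, and EM that only updates the transition matrix.

rng = np.random.default_rng(0)

n_symbols = 3                                   # observation alphabet size
n_clones = 4                                    # hidden clones per symbol
n_states = n_symbols * n_clones
state_to_symbol = np.repeat(np.arange(n_symbols), n_clones)  # fixed emission map

T = rng.random((n_states, n_states))            # random initial transitions
T /= T.sum(axis=1, keepdims=True)

def em_step(T, obs):
    """One rough Baum-Welch step: emissions stay fixed, transitions update."""
    n = len(obs)
    mask = [state_to_symbol == o for o in obs]  # states allowed at each step
    alpha = np.zeros((n, n_states))             # scaled forward messages
    alpha[0, mask[0]] = 1.0 / mask[0].sum()
    for t in range(1, n):
        a = np.where(mask[t], alpha[t - 1] @ T, 0.0)
        alpha[t] = a / a.sum()
    beta = np.ones((n, n_states))               # scaled backward messages
    for t in range(n - 2, -1, -1):
        b = T @ np.where(mask[t + 1], beta[t + 1], 0.0)
        beta[t] = b / b.sum()
    xi = np.zeros((n_states, n_states))         # expected transition counts
    for t in range(n - 1):
        x = np.outer(alpha[t], np.where(mask[t + 1], beta[t + 1], 0.0)) * T
        xi += x / x.sum()
    new_T = xi + 1e-6                           # smooth to avoid dead rows
    return new_T / new_T.sum(axis=1, keepdims=True)

obs = np.array([0, 1, 2, 0, 2, 1] * 20)         # repeating toy sequence
for _ in range(50):
    T = em_step(T, obs)
```

The point of the clones is that the same observation can be represented by different hidden states depending on context, which is what lets a plain first-order transition matrix capture higher-order sequence structure.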
I would say that the generative models are a consortium of thousands of glued-together mini-generative-models, maybe even one model per cortical column, which are self-consistent in that they’re not issuing mutually-contradictory predictions (often because any given mini-model simply abstains from making predictions about most things). Some of the mini-model pieces stick around a while, while other pieces get thrown out and replaced constantly, many times per second, either in response to new sensory data or just because the models themselves have time-dependence. Like if someone tosses you an apple, there’s a set of models (say, in language and object-recognition areas) that really just mean “this is an apple”, and they’re active the whole time, while there are other models (say, in a sensory-motor area) that say “I will reach out in a certain way and catch the apple, and it will feel like this when it touches my hand”, and some subcomponents of the latter keep getting edited or replaced as you watch the apple and update your belief about its trajectory. I think “edited or replaced” is the right way to think about it (both can happen), but I won’t say more because now this is getting into low-level gory details that are highly uncertain anyway. :-P
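As a purely illustrative gloss on that picture (my own toy sketch, with made-up feature names, not anything the comment above commits to): each mini-model predicts only a few features and abstains on the rest, the active consortium is checked for mutual consistency, and a mini-model gets swapped out when new sensory data contradicts one of its predictions, while models that abstained on the mismatched features stay put.

```python
from dataclasses import dataclass

@dataclass
class MiniModel:
    name: str
    predictions: dict  # feature -> predicted value; a missing key means "abstain"

def consistent(models):
    """Active models must not issue mutually-contradictory predictions."""
    combined = {}
    for m in models:
        for feat, val in m.predictions.items():
            if feat in combined and combined[feat] != val:
                return False
            combined[feat] = val
    return True

def matches(m, observed):
    """A model survives if no observed feature contradicts its predictions."""
    return all(observed.get(f, v) == v for f, v in m.predictions.items())

def update(active, candidates, observed):
    """Drop contradicted models, then admit candidates that fit the data
    and keep the consortium self-consistent."""
    kept = [m for m in active if matches(m, observed)]
    for c in candidates:
        if matches(c, observed) and consistent(kept + [c]):
            kept = kept + [c]
    return kept

# "This is an apple" stays active the whole time; the catch/trajectory
# model is the piece that gets replaced as the estimated path changes.
apple      = MiniModel("apple",      {"object": "apple"})
catch_low  = MiniModel("catch_low",  {"trajectory": "low",  "hand": "waist"})
catch_high = MiniModel("catch_high", {"trajectory": "high", "hand": "head"})

active = [apple, catch_low]
observed = {"object": "apple", "trajectory": "high"}   # new sensory data
active = update(active, [catch_high], observed)
print([m.name for m in active])   # ['apple', 'catch_high']
```

The abstention rule is doing the real work here: the “apple” model never predicts anything about trajectories, so it is never contradicted, while the catch model is exactly the subcomponent that gets edited or replaced as the belief about the trajectory updates.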
Thanks a lot for the elaboration!
Just a sidenote, one of my profs is part of the Bayesian CogSci crowd and was fairly frustrated with and critical of both Friston and Clark. We read one of Friston’s papers in our journal club and came away thinking that Friston is reinventing a lot of wheels and using odd terms for known concepts.
For me, this paper by Sam Gershman helped a lot in understanding Friston’s ideas, and this one by Laurence Aitchison and Máté Lengyel was useful, too.
Cool, I like that idea. I previously thought of the models as fairly separate, bulky entities; yours sounds much more plausible.
As a (maybe misguided) side comment, model sketches like yours make me intuitively update toward shorter AI timelines, because they give me a sense of a maturing field of computational cognitive science. I’d be really interested in what others think about that.
I think I’m in a distinct minority on this forum, maybe a minority of 1, in thinking that there’s more than 50% chance that studying and reverse-engineering neocortex algorithms will be the first way we get AGI. (Obviously I’m not the only one in the world with this opinion, just maybe the only one on this forum.)
I think there’s a good outside-view argument, namely that this is an active field of research and that, unlike almost any other research program, at the end of it we’re all but guaranteed to have AGI-capable algorithms.
I think there’s an even stronger (to me) inside-view argument, in which cortical uniformity plays a big role, because (1) if one algorithm can learn languages and image-processing and calculus, that puts a ceiling on the level of complexity and detail within that algorithm, and (2) my reading of the literature makes me think that we already understand the algorithm at least vaguely, and the details are starting to crystallize into view on the horizon … although I freely acknowledge that this might just be the Dunning-Kruger talking. :-)
That’s really interesting. I haven’t thought about this much, but it seems very plausible, and big if true (though I am likely biased as a Cognitive Science student). Do you think this might be turned into a concrete question for the Metaculus crowd to forecast, e.g. “Reverse-engineering neocortex algorithms will be the first way we get AGI”? The resolution might get messy if an org like DeepMind, with its fair share of computational neuroscientists, ends up being the one that gets there first, right?
Yeah, I think it would be hard to pin down. Obviously AGI will resemble neocortical algorithms in some respects, and obviously it will be different in others. For example, the neocortex uses distributed representations, deep neural nets use distributed representations, and the latter were historically inspired by the former, I think. And conversely, there’s no way AGI will have synaptic vesicles! In my mind this probabilistic programming system with no neurons (https://youtu.be/yeDB2SQxCEs) is “more like the neocortex” than a ConvNet, but that’s obviously just a particular thing I have in mind, not an objective assessment of how brain-like something is. Maybe a concrete question would be “Will AGI programmers look back on the 2010s work of people like Dileep George, Randall O’Reilly, etc. as an important part of their intellectual heritage, or as just two more of the countless thousands of CS researchers?” But I dunno, and I’m not sure that’s a good fit for Metaculus anyway.