By the way, an interesting aspect of cortical uniformity is that it’s a giant puzzle piece into which we need to (and haven’t yet) fit every other aspect of human nature and psychology. There should be whole books written on this. Instead, nothing. For example, I have all sorts of social instincts—guilt, the desire to be popular, etc. How exactly does that work? The neocortex knows whether or not I’m popular, but it doesn’t care, because (on this view) it’s just a generic learning algorithm. The old brain cares very much whether I’m popular, but it’s too stupid to understand the world, so how would it know whether I’m popular or not? I’ve casually speculated on this a bit (e.g. here) but it seems like a gaping hole in our understanding of the brain, and you won’t find any answers in Hawkins’s book … or anywhere else as far as I know!
Or maybe that’s a reason for rejecting, in general, the idea that cognition and motivation are handled by separate modules!
I think the way this could work, conceptually, is as follows. Maybe the Old Brain does have dedicated “detectors” for specific events like: are people smiling at me, glaring at me, shouting at me, hitting me; has something that was “mine” been stolen from me; is that cluster of sensations an “agent”; does this hurt, or feel good. These seem to be the kinds of events that small children, most mammals, and even some reptiles are able to understand.
The neocortex then constructs increasingly nuanced models on top of these base-level events. It builds up fairly sophisticated cognitive behavior, such as romantic jealousy, or the desire to win a game, or the perception that a specific person is a rival, or a long-term plan to get a college degree, by gradually linking up elements of its learned world model with internally imagined expectations of ending up in states that it natively perceives (via the Old Brain) as good or bad.
Obviously the neocortex isn’t just passively learning; it’s also constantly doing forward-modeling/prediction, using its learned model to try to navigate toward desirable states. Imagined instances of burning your hand on a stove are linked with real memories of burning your hand on a stove, and so imagined plans that would lead to a burned hand are perceived as undesirable, because the Old Brain knows instinctively (i.e. without needing to learn) that this is a bad outcome.
eta: Not wholly my original thought, but I think one of the main purposes of dreams is to provide large amounts of simulated data aimed at linking up the neocortical model of reality with the Old Brain. The sorts of things that happen in dreams tend to be very dramatic and scary. I think the sleeping brain is intentionally seeking out parts of the state space that agitate the Old Brain, in order to link up the map of the outside world with the inner sense of innate goodness and badness.
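To make that picture concrete, here’s a minimal toy sketch in Python. It’s purely illustrative (all the event names, states, and numbers are invented, and nothing here is a claim about real neural learning rules): a hardcoded table of innate valences for a few primitive detectors, plus a learned forward model that evaluates imagined plans by the primitives they’re predicted to trigger.

```python
# Toy sketch only: hardcoded "old brain" valences plus a learned "neocortex"
# forward model that scores imagined plans. All events and values are invented.

# Innate, unlearned valences for the primitive events the old brain can detect.
OLD_BRAIN_VALENCE = {
    "smiled_at": +1.0,
    "glared_at": -0.5,
    "hit": -2.0,
    "hand_burned": -3.0,
    "possession_stolen": -1.0,
}


class Neocortex:
    """Learned world model: maps (state, action) to predicted primitive events."""

    def __init__(self) -> None:
        # Learned from experience; filled in by hand for this sketch.
        self.transition_model: dict[tuple[str, str], list[str]] = {}

    def learn(self, state: str, action: str, observed_events: list[str]) -> None:
        self.transition_model[(state, action)] = observed_events

    def imagine(self, state: str, action: str) -> list[str]:
        # Forward-modeling/prediction: which primitives would this action lead to?
        return self.transition_model.get((state, action), [])

    def evaluate_plan(self, state: str, action: str) -> float:
        # Imagined outcomes inherit valence from the old brain's hardcoded table.
        return sum(OLD_BRAIN_VALENCE.get(e, 0.0) for e in self.imagine(state, action))


cortex = Neocortex()
cortex.learn("near_stove", "touch_burner", ["hand_burned"])
cortex.learn("near_stove", "cook_dinner_for_friends", ["smiled_at"])

# Planning = preferring imagined futures the old brain innately scores as good.
actions = ["touch_burner", "cook_dinner_for_friends"]
best = max(actions, key=lambda a: cortex.evaluate_plan("near_stove", a))
print(best)  # -> cook_dinner_for_friends
```

The point of the sketch is just the division of labor: the valence table is fixed in advance, while everything the forward model knows about the world is learned.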
Yeah, that’s pretty much along the lines of what I’ve been thinking. There are a lot of details to flesh out, though. I’ve been working in that direction.
The idea that the neocortex is running a learning algorithm that needs some kind of evaluative weighting to start working isn’t exclusive of the idea that the neocortex can learn to perform its own evaluations.
If the brain does something which would be impossible on the assumption of cortical uniformity, then that would indeed be a very good reason to reject cortical uniformity. :-)
If it’s not immediately obvious, with no effort or research whatsoever, how the brain can do something on the assumption of cortical uniformity, I don’t think that’s evidence of much of anything!
In fact I came up with a passable attempt at a cortical-uniformity-compatible theory of social instincts pretty much as soon as I tried. It was pretty vague at that point … I didn’t know any neuroscience yet … but it was a start. I’ve been iterating since then, and I feel like I’m making good progress, and that this constraint / assumption is a giant help in making sense of the neuroscientific data, not a handicap. :-)
Does it? I don’t think cortical uniformity implies the separation of motivation and mapping.
Oh, sorry, I think I misunderstood the first time. Hmm, so in this particular post I was trying to closely follow the book by treating the neocortex on its own. That’s not how I normally describe things; normally I talk about the neocortex and basal ganglia working together as a subsystem.
So when I think “I really want to get out of debt”, it’s a combination of a thing in the world-model and a motivation / valence. I do in fact think that those two aspects of the thought are anatomically distinct: I think the meaning of “get out of debt” (a complex set of relationships and associations and so on) is stored as synapses in the neocortex, and the fact that we want that is stored as synapses in the basal ganglia (more specifically, the striatum). But obviously the two are very, very closely interconnected.
E.g. see here for that more “conventional” RL-style description.
Reward, on the other hand, strikes me as necessarily a very different module. After all, if you only have a learning algorithm that starts from scratch, there’s nothing within that system that can say that making friends is good and getting attacked by lions is bad, as opposed to the other way around. Right? So you need a hardcoded reward-calculator module, seems to me.
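Here’s a toy Python sketch of that three-way division, just to illustrate the shape of the claim (the class names, events, and numbers are all invented; this is a caricature, not a model of the actual circuits): the content of a thought lives in a learned “neocortex” store, how much it’s wanted lives in a learned “striatum” value table, and the ground-truth reward signal comes from a small hardcoded module that neither learned component could derive from scratch.

```python
# Caricature of the division described above; everything here is invented.

def hardcoded_reward(event: str) -> float:
    """Innate reward calculator: fixed in advance, not learned from data."""
    table = {"made_friend": +1.0, "attacked_by_lion": -10.0}
    return table.get(event, 0.0)


class Neocortex:
    """Learned content of thoughts: what 'get out of debt' means, its associations."""

    def __init__(self) -> None:
        self.associations: dict[str, set[str]] = {}

    def learn_association(self, concept: str, related: str) -> None:
        self.associations.setdefault(concept, set()).add(related)


class Striatum:
    """Learned valence: how much each thought/plan is wanted, trained by reward."""

    def __init__(self, lr: float = 0.1) -> None:
        self.value: dict[str, float] = {}
        self.lr = lr

    def update(self, concept: str, reward: float) -> None:
        old = self.value.get(concept, 0.0)
        self.value[concept] = old + self.lr * (reward - old)


cortex, striatum = Neocortex(), Striatum()

# Meaning lives in the neocortex...
cortex.learn_association("get_out_of_debt", "pay_off_credit_card")

# ...while wanting is learned in the striatum, driven by the hardcoded reward signal.
striatum.update("befriend_neighbor", hardcoded_reward("made_friend"))
striatum.update("pet_the_lion", hardcoded_reward("attacked_by_lion"))
print(striatum.value)  # befriend_neighbor drifts positive, pet_the_lion negative
```

Without the hardcoded reward function, nothing in the two learned components could say which of those two plans should end up with positive value.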
Sorry if I’m still misunderstanding.
I also studied neuroscience for several years, and Jeff’s first book was a major inspiration for me to begin that journey. I agree very much with the points you make in this review and in https://www.lesswrong.com/posts/W6wBmQheDiFmfJqZy/brain-inspired-agi-and-the-lifetime-anchor
Since we seem to be more on the same page than most other people I’ve talked to about this, perhaps a collaboration between us could be fruitful. Not sure what exactly, but I’ve been thinking about how to transition into direct work on AGI safety since updating, in the past couple of years, toward AGI potentially being even closer than I’d thought.
As for the brain division, I also think of the neocortex and basal ganglia working together as a subsystem. I actually strengthened my belief in their tight coupling in my last year of grad school, when I learned more about the striatum gating thoughts (not just motor actions) and the cerebellum smoothing abstract thoughts (not just motor actions).

So now I envision the brain more as thousands of mostly repetitive loops: little neocortex region → little basal ganglia region → little hindbrain region → little cerebellum region → same little neocortex region, with these loops also communicating sideways a bit in each region, but mostly in the neocortex.

With this understanding, I feel like I can’t at all get behind Jeff H’s idea of safely separating out the neocortical functions from the mid/hindbrain functions. I think that an effective AGI general learning algorithm is likely to have to have at least some aspects of those little loops, with striatum gating, cerebellar smoothing, and hippocampal memory linkages. I do think that the available data in neuroscience is very close to sufficient, if not already sufficient, for describing the necessary algorithm, and it’s just a question of a bit more focused work on sorting out the necessary parts from the unneeded complexity. I pulled back from actively trying to do just that once I realized that gaining that knowledge without sufficient safety preparation could be a bad thing for humanity.
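For what it’s worth, here’s a bare-bones Python sketch of that loop topology, just to show the wiring rather than any real computation. Every stage function below is a made-up placeholder: the point is only the structure of many parallel loops, each closing back on its own cortical region, with lateral mixing happening at the cortex stage.

```python
# Bare-bones sketch of the loop topology only; every stage below is a placeholder,
# not a model of what these regions actually compute.

import random


def basal_ganglia_gate(x: float, threshold: float = 0.5) -> float:
    # "Gating": let the signal through only if it clears a threshold.
    return x if x > threshold else 0.0


def hindbrain_relay(x: float) -> float:
    return x  # placeholder relay stage


def cerebellum_smooth(prev: float, x: float, alpha: float = 0.5) -> float:
    # "Smoothing": blend the new signal with the previous one.
    return alpha * x + (1.0 - alpha) * prev


class Loop:
    """One cortex -> basal ganglia -> hindbrain -> cerebellum -> same-cortex loop."""

    def __init__(self) -> None:
        self.cortex_state = random.random()
        self.cerebellum_state = 0.0

    def step(self, lateral_input: float) -> None:
        # Sideways communication happens (mostly) at the neocortex stage.
        x = 0.8 * self.cortex_state + 0.2 * lateral_input
        x = basal_ganglia_gate(x)
        x = hindbrain_relay(x)
        self.cerebellum_state = cerebellum_smooth(self.cerebellum_state, x)
        self.cortex_state = self.cerebellum_state  # the loop closes on the same region


loops = [Loop() for _ in range(1000)]  # "thousands of mostly repetitive loops"
for _ in range(10):
    lateral = sum(l.cortex_state for l in loops) / len(loops)
    for l in loops:
        l.step(lateral_input=lateral)
print(round(sum(l.cortex_state for l in loops) / len(loops), 3))
```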
Oh wow, cool! I was still kinda confused when I wrote this post and the comment thread above, but a couple of months later I wrote Big Picture of Phasic Dopamine, which sounds at least somewhat related to what you’re talking about and in particular talks a lot about basal ganglia loops.
Oh, except that post leaves out the cerebellum (for simplicity). I have a VERY simple cerebellum story (see the one-sentence version here) … I’ve done some poking around the literature and talking to people about it, but anyway I currently still stand by my story and am still confused about why all the other cerebellum-modelers make things so much more complicated than that. :-P
We do sound like we’re on the same page … I’d love to chat; feel free to email or DM me if you have time.