One thing that strikes me as odd about this model is that it doesn’t get the blessing of dimensionality: each plan is one loop, and giving feedback on a winning plan just means sending feedback to that one loop. When the reward is general, we can simplify this by just rewarding whichever plans won recently, but in some places you do seem to imply highly specific feedback, for which you need N feedback channels to give feedback on ~N possible plans. The “blessing of dimensionality” kicks in when you can use more diverse combinations of a smaller number of feedback channels to encode more specific feedback.
Maybe what seems to be specific feedback is actually a smaller number of general types? Like rather than specific feedback to snake-fleeing plans or whatever, a broad signal (like how Success-In-Life Reward is a general signal rewarding whatever just got planned) could be sent out that means “whatever the amygdala just did to make the snake go away, good job” (or something). Note that I have no idea what I’m talking about.
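To make the channel-counting point above concrete, here is a minimal Python sketch. It is not part of the model under discussion, and the function names are made up for illustration. It assumes feedback channels are simple on/off signals: with one dedicated channel per plan you need N channels for N plans, whereas on/off combinations of k channels can single out up to 2^k plans.

```python
def channels_one_per_plan(n_plans):
    """One dedicated feedback channel per plan: N plans need N channels."""
    return n_plans


def channels_combinatorial(n_plans):
    """Smallest k such that 2**k >= n_plans (at least one channel)."""
    return max(1, (n_plans - 1).bit_length())


def encode(plan_index, n_channels):
    """Which channels fire (1) or stay silent (0) to address one plan.

    Assumption for illustration: plans are addressed by a plain binary code.
    """
    return [(plan_index >> bit) & 1 for bit in range(n_channels)]


if __name__ == "__main__":
    n = 1000
    print(channels_one_per_plan(n), "channels with one loop per plan")
    print(channels_combinatorial(n), "channels with combinatorial codes")
    print("plan 5 on 4 channels:", encode(5, 4))
```

Whether the brain does anything like the second scheme is exactly the open question here; the sketch only shows why the two schemes scale so differently.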
Right, so I’m saying that the “supervised learning loops” get highly specific feedback, e.g. “if you get whacked in the head, then you should have flinched a second or two ago”, “if a salty taste is in your mouth, then you should have salivated a second or two ago”, “if you just started being scared, then you should have been scared a second or two ago”, etc. etc. That’s the part that I’m saying trains the amygdala and agranular prefrontal cortex.
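As a rough illustration of what “you should have flinched a second or two ago” could mean computationally, here is a toy Python sketch. Everything in it is an assumption made for illustration: the `SupervisedLoop` class, the two-feature context, the one-step delay, and the simple delta-rule update. It is not a claim about how the amygdala actually does it.

```python
from collections import deque


class SupervisedLoop:
    """One hypothetical loop that learns when it 'should have' fired."""

    def __init__(self, n_features, delay=1, lr=0.5):
        self.weights = [0.0] * n_features
        self.lr = lr
        # Keep the last `delay + 1` contexts, so the oldest entry is the
        # context from `delay` steps ago.
        self.recent = deque(maxlen=delay + 1)

    def predict(self, context):
        return sum(w * x for w, x in zip(self.weights, context))

    def step(self, context, event_now):
        """Observe the current context plus the hindsight signal.

        event_now is 1.0 if the triggering event (e.g. a whack) is happening
        now, meaning the loop should have fired `delay` steps ago; else 0.0.
        """
        self.recent.append(context)
        if len(self.recent) == self.recent.maxlen:
            past = self.recent[0]
            error = event_now - self.predict(past)
            # Delta rule applied to the *past* context, not the current one.
            self.weights = [w + self.lr * error * x
                            for w, x in zip(self.weights, past)]


# Hypothetical features: [something is looming, ordinary background].
LOOMING, BACKGROUND = [1.0, 0.0], [0.0, 1.0]
loop = SupervisedLoop(n_features=2)
for _ in range(20):
    loop.step(LOOMING, event_now=0.0)     # something looms, no whack yet
    loop.step(BACKGROUND, event_now=1.0)  # the whack lands one step later
    loop.step(BACKGROUND, event_now=0.0)  # quiet step
```

After training, the loop fires for the looming context (which preceded the whack) and stays silent for the background context, even though the error signal only ever arrived after the fact.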
Then I’m suggesting that the Success-In-Life thing is a 1D reward signal to guide search in a high-dimensional space of possible thoughts to think, just like RL. In this case, it’s not “each plan is one loop”, because there’s a combinatorial explosion of possible thoughts you can think, and there are not enough loops for that. (It also wouldn’t work because for pretty much every thought you think, you’ve never thought that exact thought before—like you’ve never put on this particular jacket while humming this particular song and musing about this particular upcoming party...) Instead I think compositionality is involved, such that one plan / thought can involve many simultaneous loops.
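Here is a toy Python sketch of how a single scalar reward could still say something about a thought you have never had before, if thoughts are made of reusable pieces. The additive credit-splitting rule, the `CompositionalValue` class, and the component names are all assumptions for illustration only.

```python
class CompositionalValue:
    """Scalar reward shared among the components of a composite thought."""

    def __init__(self, lr=0.5):
        self.value = {}  # learned value per component ("loop")
        self.lr = lr

    def score(self, thought):
        """Estimated value of a thought = sum over its components."""
        return sum(self.value.get(part, 0.0) for part in thought)

    def reward(self, thought, r):
        """Spread the 1D reward error evenly over the active components."""
        error = r - self.score(thought)
        for part in thought:
            old = self.value.get(part, 0.0)
            self.value[part] = old + self.lr * error / len(thought)


model = CompositionalValue()
model.reward(("jacket", "humming"), r=1.0)    # that went well
model.reward(("jacket", "stub_toe"), r=-1.0)  # that did not

# Neither of these exact thoughts has occurred before, yet they can be ranked.
novel_good = ("humming", "party")
novel_bad = ("stub_toe", "party")
```

Here `model.score(novel_good)` comes out higher than `model.score(novel_bad)`, because each novel thought inherits value from the components it shares with earlier thoughts.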
How does the section of the amygdala that a particular dopamine neuron connects to even get trained to do the right thing in the first place? It seems like the connections are chancy enough that there’s really only this one neuron linking a particular brainstem output to this specific spot in the amygdala; the brainstem doesn’t have a whole bundle of different signals available to send to this exact spot.
Supervised learning (SL) in the brain seems tricky because not only does the brainstem have to reinforce behaviors in appropriate contexts, it might have to train certain outputs to correspond to certain behaviors in the first place, all with only one wire to each location! Maybe you could do this with a single signal that means both “imitate the current behavior” and also “learn to do your behavior in this context”? Alternatively we might imagine some separate mechanism for priming the developing amygdala to start out with a diverse yet sensible array of behavior proposals, and the brainstem could learn what its outputs correspond to and then signal them appropriately.
I’m proposing that (1) the hypothalamus has an input slot for “flinch now”, (2) VTA has an output signal for “should have flinched”, (3) there is a bundle of partially-redundant side-by-side loops (see the “probability distribution” comment) that connect specifically to both (1) and (2), by a genetically-hardcoded mechanism.
I take your comment to be saying: Wouldn’t it be hard for the brain to orchestrate such a specific pair of connections across a considerable distance?
Well, I’m very much not an expert on how the brain wires itself up. But I think there’s gotta be some way that it can do things like that. I feel like those kinds of feats of wiring are absolutely required for all kinds of reasons. Like, I think motor cortex connects directly to spinal hand-control nerves, but not foot-control nerves. How do the output neurons aim their paths so accurately, such that they don’t miss and connect to the foot nerves by mistake? Um, I don’t know, but it’s clearly possible. “Molecular signaling” or something, I guess?
Hmm, on your suggestion of a separate mechanism, one reasonable (to me) possibility along those lines would be something like: “VTA has 20 dopamine output signals, and they’re guided to wind up spread out across the amygdala, but not with surgical precision. Meanwhile the corresponding amygdala loops terminate in an ‘input zone’ of the lateral hypothalamus, but not at any particular spot; instead they float around, unsure of exactly which hypothalamus ‘entry point’ to connect to. And there are 20 of these intended ‘entry points’ (collections of neurons for flinching, scowling, etc.). OK, then during embryonic development, the entry-point neurons are firing randomly, and that signal goes around the loop: within the hypothalamus and to VTA, then up to the amygdala, then back down to that floating neuron. Then Hebbian learning (i.e. matching the random code) helps the right loop neuron find its way to the matching hypothalamus entry point.”
I’m not sure if that’s exactly what you’re proposing, but that seems like a perfectly plausible way for the brain to orchestrate these connections during embryonic development. I do have a hunch that this isn’t what happens, that the real mechanism is “molecular signaling” instead. But like I said, I’m not an expert, and I certainly wouldn’t be shocked to learn that the brain’s embryonic wiring mechanism involves this kind of thing where it closes a loop by sending a random code around the loop and Hebbian-learning the final connection.
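For what it’s worth, the random-code idea above is easy to simulate. The following Python sketch is a toy under loud assumptions: activity is coded as ±1, the trip around the loop is treated as instantaneous and noise-free, and the function name `close_loops` and its parameters are invented for illustration. Each “floating” axon carries the activity of one unknown entry point, accumulates a Hebbian correlation with every candidate entry point, and then attaches to the best match.

```python
import random


def close_loops(n_points=20, n_steps=200, seed=0):
    """Return (true wiring, wiring found by Hebbian matching)."""
    rng = random.Random(seed)

    # Hidden ground truth: floating axon j carries the signal that
    # originated at entry point source[j].
    source = list(range(n_points))
    rng.shuffle(source)

    # weights[j][i]: Hebbian strength between floating axon j and entry point i.
    weights = [[0.0] * n_points for _ in range(n_points)]

    for _ in range(n_steps):
        # Entry-point neurons fire at random (+1 active, -1 silent).
        entry = [rng.choice((-1, 1)) for _ in range(n_points)]
        for j in range(n_points):
            axon = entry[source[j]]  # what arrives after one trip round the loop
            for i in range(n_points):
                # "Fire together, wire together": correlated activity adds
                # strength, anti-correlated activity removes it.
                weights[j][i] += axon * entry[i]

    # Each floating axon attaches to its most strongly correlated entry point.
    chosen = [row.index(max(row)) for row in weights]
    return source, chosen


if __name__ == "__main__":
    source, chosen = close_loops()
    print("all loops closed correctly:", chosen == source)
```

In this idealized setting the matching entry point accumulates the maximum possible strength while the others only drift like a random walk, so the right connection stands out clearly. Real loops would have delays and noise, which this sketch ignores.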
I enjoy that you have an algorithm which presumes the existence of some hypothetical mechanism, whereas researchers in labs have been elucidating these mechanisms for years without necessarily having any coherent vision of agentic architectures <3
It’s like you don’t know about keywords like “growth cone” or “chemotaxis” or attempts to visualize chemoattractant gradients!
One of the main idioms of brain wiring is basically for axon tips to do chemotaxis (often through various way stations, in sequence). If they find the right home base, they notice and “decide” to survive; otherwise they commit suicide and have to be cleaned up (probably to save on neural metabolic demands? and/or to reduce noise?). But then it seems like maybe there are numerous similar systems all kind of working in parallel, each with its own little details, like the “homotopic connections” that run through the corpus callosum between each spot in one hemisphere and its rough cognate in the other hemisphere?
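A cartoon version of that idiom, as a Python sketch: an axon tip on a grid greedily climbs a chemoattractant gradient toward each way station in turn, and it survives only if it reaches the final one. The grid, the shape of the gradient, the step budget standing in for “failed to find home”, and the names `concentration` and `grow_axon` are all assumptions for illustration, not real biology.

```python
def concentration(pos, target):
    """Made-up chemoattractant level: highest at the target, falling with distance."""
    dx, dy = pos[0] - target[0], pos[1] - target[1]
    return 1.0 / (1.0 + dx * dx + dy * dy)


def grow_axon(start, waystations, max_steps):
    """Follow the gradient to each way station in sequence.

    Returns (final position, survived). An axon that runs out of steps
    before reaching the last way station counts as pruned.
    """
    pos, steps = start, 0
    for target in waystations:
        while pos != target:
            if steps == max_steps:
                return pos, False  # never found home: pruned
            neighbors = [(pos[0] + dx, pos[1] + dy)
                         for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
            # Step to whichever neighboring spot smells strongest.
            pos = max(neighbors, key=lambda p: concentration(p, target))
            steps += 1
    return pos, True  # reached home base: survives


if __name__ == "__main__":
    route = [(3, 0), (3, 4)]
    print(grow_axon((0, 0), route, max_steps=10))
    print(grow_axon((0, 0), route, max_steps=5))
```

With a generous budget the axon reaches the final way station and survives; with too small a budget it stalls partway along the route and is marked as pruned.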
The normal way it works, I think, is for people to get the big-picture wiring diagram by simply looking, and then do biochemistry and so on, and then back their way into vague hunches about what algorithms could be consistent with such diagrams and mechanisms? You seem to be going “algorithms first” instead :-)
Thanks!! And thanks for the wiring references! Such intricate complexity everywhere you look! Sometimes I wonder “how is there so much to say about neuroscience that we can write 50,000 neuroscience papers each year, year after year?”, and then I see stuff like this and say “Oh, that’s how.” :-P