Thanks for cross-posting this! Sorry I didn’t get around to responding originally. :-)
E.g., one thing RL currently does, which I don’t expect the inner algorithm to be able to do, is make the first three layers of the network vision layers, a big region over on the other side the language submodule, and so on. And eventually I expect RL to shape the way the inner algorithm does weight updates, via meta-learning.
For what it’s worth, I figure that the neocortex has some number (dozens to hundreds, maybe 180 like your link says, I dunno) of subregions that each do a task vaguely like “predict data X from context Y”, with different X & Y & hyperparameters in different subregions. So some design work is obviously required to wire up those subregions. (For a taste of what that might look like in more detail, see maybe Randall O’Reilly’s vision-learning model.) I figure this is vaguely analogous to figuring out what convolution kernel sizes and strides you need in a ConvNet, and that specifying all this is maybe hundreds or low thousands, but not millions, of bits of information. (I don’t really know right now, I’m just guessing.) Where will those bits of information come from? I figure, some combination of:
automated neural architecture search
and/or people looking at the neuroanatomy literature and trying to copy ideas
and/or when the working principles of the algorithm are better understood, maybe people can just guess what architectures are reasonable, just as somebody presumably invented U-Nets by sitting and thinking about what’s a reasonable architecture for image segmentation, followed by some trial-and-error tweaking.
and/or some kind of dynamic architecture that searches for learnable relationships and makes those connections on the fly … I imagine a computer would be able to do that to a much greater extent than a brain (where signals travel slowly, new long-range high-bandwidth connections are expensive, etc.)
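To make the “hundreds to low thousands of bits” estimate and the brute-force search option concrete, here’s a minimal toy sketch. All the names and choices in it (the particular hyperparameters, the score function) are illustrative assumptions, not anything from the neuroscience: each subregion’s “predict X from Y” task is pinned down by a handful of discrete choices, the total spec length in bits is just the log of the number of possible configurations, and “automated neural architecture search” in its dumbest form is random sampling over that space.

```python
import math
import random

# Hypothetical per-region design choices (purely illustrative): each
# subregion's "predict X from context Y" task is pinned down by a few
# discrete hyperparameters.
CHOICES = {
    "input_source":   ["vision", "audio", "motor", "language"],  # what X is
    "context_source": ["vision", "audio", "motor", "language"],  # what Y is
    "kernel_size":    [3, 5, 7],
    "stride":         [1, 2],
    "learning_rate":  [0.5, 1.0, 1.5, 2.0],
}

def bits_per_region(choices=CHOICES):
    """Bits needed to write down one region's hyperparameter settings."""
    return sum(math.log2(len(options)) for options in choices.values())

def spec_bits(n_regions, choices=CHOICES):
    """Total bits to specify the whole architecture."""
    return n_regions * bits_per_region(choices)

def random_spec(n_regions, rng):
    """One candidate architecture: a hyperparameter dict per region."""
    return [{name: rng.choice(options) for name, options in CHOICES.items()}
            for _ in range(n_regions)]

def architecture_search(n_regions, score_fn, n_trials, seed=0):
    """Dumbest-possible NAS: sample random specs, keep the best-scoring."""
    rng = random.Random(seed)
    return max((random_spec(n_regions, rng) for _ in range(n_trials)),
               key=score_fn)

if __name__ == "__main__":
    # With ~180 regions and these choices, the spec is ~1500 bits --
    # tiny compared to the weights the inner algorithm will learn.
    print(f"{spec_bits(180):.0f} bits for 180 regions")
    # Stand-in score: no actual training here, just a placeholder
    # objective so the search loop has something to optimize.
    toy_score = lambda spec: -sum(abs(r["learning_rate"] - 1.0) for r in spec)
    best = architecture_search(n_regions=4, score_fn=toy_score, n_trials=200)
    print(best[0])
```

With these made-up choices it comes out to roughly 8.6 bits per region, so ~1500 bits for 180 regions — which is why a real NAS loop (with training runs as the score function) plausibly has enough search signal to find those bits, in a way that finding the millions of bits in the learned weights by outer-loop search would not.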
If I understand your comment correctly, we might actually agree on the plausibility of the brute-force “automated neural architecture search” / meta-learning case. …Except for the terminology! I’m not calling it an “evolution analogy”, because the final learning algorithm is mainly (in terms of information content) human-designed and by-and-large human-legible. Like, maybe humans won’t have a great story for why the learning rate is 1.85 in region 72 but only 1.24 in region 13… But they’ll have the main story of the mechanics of the algorithm and why it learns things. (You can correct me if I’m wrong.)