there’s a “solving the problem twice” issue. As mentioned above, in Case 5 we need both the outer and the inner algorithm to be able to do open-ended construction of an ever-better understanding of the world—i.e., we need to solve the core problem of AGI twice with two totally different algorithms! (The first is a human-programmed learning algorithm, perhaps SGD, while the second is an incomprehensible-to-humans learning algorithm. The first stores information in weights, while the second stores information in activations, assuming a GPT-like architecture.)
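As a concrete (and heavily simplified) illustration of that weights-versus-activations split, here is a toy sketch in plain NumPy. It is my own stand-in, not anything from the post: the “outer algorithm” is a human-written optimizer (a dumb hill-climb standing in for SGD) that stores what it learns in `weights`, while the “inner algorithm” is whatever the forward pass does with its context, which can only keep what it figures out in activations that are discarded at the end of each episode.

```python
import numpy as np

rng = np.random.default_rng(0)

def inner_algorithm(weights, context_x, context_y, query_x):
    # Everything this "learns" about the current task lives in transient
    # activations (features plus a per-episode least-squares readout),
    # all of which is thrown away when the episode ends.
    phi_context = np.tanh(context_x @ weights)
    phi_query = np.tanh(query_x @ weights)
    readout, *_ = np.linalg.lstsq(phi_context, context_y, rcond=None)
    return phi_query @ readout

def average_episode_loss(weights, n_episodes=4):
    # Sample fresh tasks; the inner algorithm must infer each one from its
    # context pairs alone. (A noisy estimate, which is fine for a toy.)
    total = 0.0
    for _ in range(n_episodes):
        true_map = rng.normal(size=(4, 1))
        context_x, query_x = rng.normal(size=(16, 4)), rng.normal(size=(8, 4))
        pred = inner_algorithm(weights, context_x, context_x @ true_map, query_x)
        total += float(np.mean((pred - query_x @ true_map) ** 2))
    return total / n_episodes

# Outer algorithm: a deliberately dumb human-written optimizer (hill-climbing
# standing in for SGD). Whatever it learns persists in `weights`.
weights = rng.normal(size=(4, 16))
for _ in range(200):
    candidate = weights + 0.05 * rng.normal(size=weights.shape)
    if average_episode_loss(candidate) < average_episode_loss(weights):
        weights = candidate
```

The point of the toy is only that these are two different algorithms operating on two different kinds of state, which is what makes the “solve the core problem twice” framing bite.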
Cross-posting a (slightly updated) comment I left on a draft of this document:
I suspect that this is indexed too closely to what current neural networks look like. I see no good reason why the inner algorithm won’t eventually be able to change the weights as well, as in human brains. (In fact, this might be a crux for me: I agree that an inner algorithm with no ability to edit the weights seems far-fetched.)
So then you might say that we’ve introduced a disanalogy to evolution, because humans can’t edit our genome.
But the key reason I think that RL is roughly analogous to evolution is that it shapes the high-level internal structure of a neural network in roughly the same way that evolution shapes the high-level internal structure of the human brain, not that there’s a totally strict distinction between levels.
E.g. the thing RL currently does, which I don’t expect the inner algorithm to be able to do, is make the first three layers of the network vision layers, and then a big region over on the other side the language submodule, and so on. And eventually I expect RL to shape the way the inner algorithm does weight updates, via meta-learning.
You seem to expect that humans will be responsible for this sort of high-level design. I can see the case for that, and maybe humans will put in some modular structure, but the trend has been pushing the other way. And even if humans encode a few big modules (analogous to, say, the distinction between the neocortex and the subcortex), I expect there to be much more complexity in how those modules actually work, determined by the outer algorithm (analogous to the hundreds of regions which appear across most human brains).
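To make the “humans encode a few big modules” scenario concrete, here is one toy way it could be written down (my own sketch, with made-up module names and roles): the coarse layout and wiring below would be the human-specified part, and the `internals` field is a placeholder for the far larger amount of structure that, on this view, the outer algorithm would end up determining.

```python
from dataclasses import dataclass

@dataclass
class BigModule:
    name: str                  # human-chosen, coarse
    role: str                  # human-chosen, coarse ("what does this part predict?")
    inputs: list[str]          # human-chosen wiring between the big modules
    internals: object = None   # everything in here is left to the outer algorithm

# Hypothetical hand-specified layout, analogous to "neocortex vs. subcortex":
hand_specified_layout = [
    BigModule("vision", "predict upcoming visual input", inputs=["camera"]),
    BigModule("language", "predict upcoming tokens", inputs=["text", "vision"]),
    BigModule("steering", "reward / steering machinery", inputs=["vision", "language"]),
]
```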
Thanks for cross-posting this! Sorry I didn’t get around to responding originally. :-)
> E.g. the thing RL currently does, which I don’t expect the inner algorithm to be able to do, is make the first three layers of the network vision layers, and then a big region over on the other side the language submodule, and so on. And eventually I expect RL to shape the way the inner algorithm does weight updates, via meta-learning.
For what it’s worth, I figure that the neocortex has some number (dozens to hundreds, maybe 180 like your link says, I dunno) of subregions that do a task vaguely like “predict data X from context Y”, with different X & Y & hyperparameters in different subregions. So some design work is obviously required to make those connections. (For a taste of what that might look like in more detail, see maybe Randall O’Reilly’s vision-learning model.) I figure this is vaguely analogous to figuring out what convolution kernel sizes and strides you need in a ConvNet, and that specifying all this is maybe hundreds or low thousands, but not millions, of bits of information. (I don’t really know right now, I’m just guessing.) Where will those bits of information come from? I figure it will be some combination of the following (there’s a toy sketch of what such a specification and search might look like after this list):
automated neural architecture search
and/or people looking at the neuroanatomy literature and trying to copy ideas
and/or once the working principles of the algorithm are better understood, maybe people can just guess what architectures are reasonable, much as somebody presumably invented U-Nets by sitting and thinking about what’s a reasonable architecture for image segmentation, followed by some trial-and-error tweaking.
and/or some kind of dynamic architecture that searches for learnable relationships and makes those connections on the fly … I imagine a computer would be able to do that to a much greater extent than a brain (where signals travel slowly, new long-range high-bandwidth connections are expensive, etc.)
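Since the list above is fairly abstract, here is a toy sketch of the first option (automated architecture search), together with a back-of-the-envelope version of the “hundreds or low thousands of bits” estimate. The region count, the menus of choices, and the scoring function are all made-up placeholders; in reality `score` would mean “train a whole system wired up like this spec and evaluate it”, which is the expensive part.

```python
import math
import random

N_REGIONS = 180                                    # ballpark figure from the discussion above
PREDICT_WHAT = ["pixels", "tokens", "motor", "reward", "another_region"]
CONTEXT_FROM = ["retina", "audio", "region_below", "region_above", "thalamus_like"]
LEARNING_RATES = [0.3, 1.0, 1.85, 3.0]             # small menu of hyperparameters

# How many bits does it take to write down one spec for every region?
choices_per_region = len(PREDICT_WHAT) * len(CONTEXT_FROM) * len(LEARNING_RATES)
spec_bits = N_REGIONS * math.log2(choices_per_region)
print(f"~{spec_bits:.0f} bits to specify all regions")   # ~1200 bits: thousands, not millions

def random_spec(rng):
    # One candidate architecture: a (what, from-where, hyperparameter) triple per region.
    return [(rng.choice(PREDICT_WHAT), rng.choice(CONTEXT_FROM), rng.choice(LEARNING_RATES))
            for _ in range(N_REGIONS)]

def score(spec):
    # Placeholder for "train a system wired up like `spec` and measure performance".
    return sum(hash(region) % 100 for region in spec)

rng = random.Random(0)
best_spec = max((random_spec(rng) for _ in range(50)), key=score)   # crude architecture search
```

Swapping the random search for a smarter search procedure, or for a human reading the neuroanatomy literature, changes where the bits come from but not roughly how many of them there are.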
If I understand your comment correctly, we might actually agree on the plausibility of the brute-force “automated neural architecture search” / meta-learning case. …Except for the terminology! I’m not calling it an “evolution analogy”, because the final learning algorithm is mainly (in terms of information content) human-designed and by-and-large human-legible. Like, maybe humans won’t have a great story for why the learning rate is 1.85 in region 72 but only 1.24 in region 13. But they’ll have the main story of the mechanics of the algorithm and why it learns things. (You can correct me if I’m wrong.)
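As a toy version of that terminological point: below, the update rule itself is a few human-written, human-legible lines (a simple delta rule stands in for whatever the real mechanics would be), while a meta-level search fills in the per-region learning rates. The resulting numbers are just “whatever worked”, which is how you could end up with a 1.85 here and a 1.24 there without anyone having a story for the particular values. All names and numbers here are stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def human_legible_update(weights, inputs, targets, lr):
    # "The main story of the mechanics": a plain prediction-error (delta-rule)
    # update that anyone can read and reason about.
    predictions = inputs @ weights
    return weights + lr * inputs.T @ (targets - predictions) / len(inputs)

def train_and_score(per_region_lrs):
    # Stand-in for "train the whole system with these hyperparameters and see
    # how well it does"; here each region just has to fit a random linear map.
    total_error = 0.0
    for lr in per_region_lrs:
        true_map = rng.normal(size=(3, 1))
        weights = np.zeros((3, 1))
        for _ in range(50):
            x = rng.normal(size=(8, 3))
            weights = human_legible_update(weights, x, x @ true_map, lr)
        x = rng.normal(size=(8, 3))
        total_error += float(np.mean((x @ weights - x @ true_map) ** 2))
    return -total_error

# Meta-level search over per-region learning rates: the mechanics stay legible,
# but the particular constants are whatever the search settled on.
candidates = [rng.uniform(0.01, 2.0, size=4) for _ in range(30)]
best_per_region_lrs = max(candidates, key=train_and_score)
```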