I was surprised when I reached this portion of the transcript. As you said, the analogous process to “how evolution happens over genomes” would be “how AI research as a field develops different approaches”. Then the analogous process to “how a human’s learning process progresses given the innate structures (such-and-such area is wired to such-and-such other area, bias to attend to faces, etc.) & learning algorithms (plasticity rules, dopamine triggers, etc.) specified by their genes” is “how an AI’s learning process progresses given the innate structures (network architectures, pretrained components, etc.) & learning algorithms (autoregressive prediction, TD-lambda, etc.) specified by their PyTorch codebase”.

See this post from Steve Byrnes for a more fleshed-out case along these lines.
I was especially confused when I got to the part where Scott says
Like, we’re not going to run evolution in a way where we naturally get AI morality the same way we got human morality, but why can’t we observe how evolution implemented human morality, and then try AIs that have the same implementation design?
and Eliezer responds
Not if it’s based on anything remotely like the current paradigm, because nothing you do with a loss function and gradient descent over 100 quadrillion neurons, will result in an AI coming out the other end which looks like an evolved human with 7.5MB of brain-wiring information and a childhood.
Say what? AFAICT, the suggestion Scott was making was not that gradient descent would produce the correct 7.5MB of brain-wiring information, but rather that those 7.5MB would be content we intentionally write into the PyTorch repo that we plan to use to train the 100-quadrillion-neuron network. In the same way that we ordinarily specify by hand how many neurons are in each layer, which parts of the network get which inputs, which pretrained feature detectors we’re using, which components are frozen vs. trained by loss functions 1+2 vs. trained by loss function 1 only, which conditions trigger how much reward, how the model samples policy rollouts, and so on.
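To illustrate the kind of thing I mean, here is a minimal toy sketch. Every architecture choice, layer size, and reward rule below is invented purely for illustration; it’s not a claim about how such a system would actually be built, just an example of the design decisions that get hand-written into the repo rather than produced by gradient descent:

```python
# Toy, made-up sketch of hand-written "innate" choices in a training repo.
import torch
import torch.nn as nn

class InnateArchitecture(nn.Module):
    def __init__(self):
        super().__init__()
        # We decide the wiring diagram and layer sizes by hand, not by gradient descent.
        self.feature_encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU())  # stand-in for a pretrained feature detector
        self.policy_head = nn.Linear(256, 10)   # trained by loss 1 (task loss)
        self.value_head = nn.Linear(256, 1)     # trained by loss 2 (value loss)
        for p in self.feature_encoder.parameters():
            p.requires_grad = False             # "frozen" component, chosen by us

    def forward(self, obs):
        feats = self.feature_encoder(obs)       # which inputs feed which component is our choice
        return self.policy_head(feats), self.value_head(feats)

def innate_reward(obs):
    # "Which conditions trigger how much reward": a hand-written rule standing in
    # for whatever hard-coded reward circuitry the designers pick.
    return (obs.mean(dim=-1) > 0.5).float()

model = InnateArchitecture()
optimizer = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=1e-3)

obs = torch.rand(32, 784)                       # stand-in observations
targets = torch.randint(0, 10, (32,))           # stand-in task labels
logits, value = model(obs)

loss1 = nn.functional.cross_entropy(logits, targets)                    # loss 1: task loss
loss2 = nn.functional.mse_loss(value.squeeze(-1), innate_reward(obs))   # loss 2: value toward hard-coded reward
(loss1 + loss2).backward()
optimizer.step()
```

The trained weights that come out of running this are the analogue of within-lifetime learning; everything hand-written above is the analogue of the genome-specified part.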
Strong agree. To pile on a bit, I think I’m confused about what Eliezer imagines the content of those 7.5MB to be.
I know what I’m imagining is in those 7.5MB: the within-lifetime-learning part comprises several learning algorithms (and corresponding inference algorithms), neural network architectures, and (space- and time-dependent) hyperparameters. The other part calculates the reward function, calculates various other loss functions, and handles lots of odds and ends like regulating heart rate and executing various other innate reactions and reflexes. So for me, these are 7.5MB of more-or-less the same kinds of things that AI & ML people are used to putting into their GitHub repositories.
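Here’s a toy sketch of that split, just to gesture at the kind of content I have in mind. All the names, numbers, and rules are invented; the point is only that this half is a small, human-written spec, while the learned weights are the enormous thing that comes out of training:

```python
# Toy illustration: a small hand-written spec vs. a huge blob of learned weights.
import json
from dataclasses import dataclass, field, asdict

@dataclass
class LearningSubsystemSpec:
    # Learning algorithms, architectures, and hyperparameters -- ordinary repo
    # contents, not millions of individual weights.
    architectures: dict = field(default_factory=lambda: {
        "cortex_like_net": {"layers": 12, "width": 4096, "recurrent": True},
        "input_routing": {"vision": "block_0", "proprioception": "block_3"},
    })
    learning_rules: dict = field(default_factory=lambda: {
        "cortex_like_net": "self-supervised predictive loss",
        "actor_critic": "TD-lambda, gamma=0.99, lambda=0.9",
    })

@dataclass
class SteeringSubsystemSpec:
    # Reward function, other loss functions, and hard-coded reactions/reflexes.
    reward_rules: list = field(default_factory=lambda: [
        "eating while hungry -> +1",
        "tissue damage -> -10",
        "social approval cue -> +0.3",
    ])
    reflexes: list = field(default_factory=lambda: [
        "orient toward sudden loud sounds",
        "regulate heart rate",
    ])

spec = {"learning": asdict(LearningSubsystemSpec()),
        "steering": asdict(SteeringSubsystemSpec())}

spec_bytes = len(json.dumps(spec).encode())   # kilobytes here: hand-written, genome-scale
learned_bytes = 4 * 10**15                    # made-up scale for a huge trained network's weights
print(f"hand-written spec: ~{spec_bytes:,} bytes")
print(f"learned weights:   ~{learned_bytes:.1e} bytes")
```

A real version of the spec would be much bigger than this toy one, but still spec-sized (megabytes, like the genome’s 7.5MB), whereas the learned weights are many orders of magnitude larger.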
By contrast, Eliezer is imagining… I’m not sure. That evolution is kinda akin to pretraining, and the 7.5MB are more-or-less specifying millions of individual weights? That I went wrong by even mentioning learning algorithms in the first place? Something else??