Daniel Kokotajlo comments on Deepmind’s Gato: Generalist Agent

Daniel Kokotajlo 20 May 2022 16:06 UTC
4 points
0
Maybe I misinterpreted you and/or her; sorry. I guess I was eyeballing Ajeya’s final distribution and seeing how much of it is above the genome anchor / medium-horizon anchor, and thinking that when someone says “we literally could scale up 2020 algorithms and get TAI” they are imagining something less expensive than that (since arguably medium/genome and above, especially evolution, represents doing a search for algorithms rather than scaling up an existing algorithm, and also takes such a ridiculously large amount of compute that it’s weird to say we “could” scale up to it). So I was thinking that the probability mass in “yes we could literally scale existing algorithms” is basically the probability mass below +12 OOMs, whereas Ajeya is at 50% by +12. I see I was probably misunderstanding you; you meant scaling up existing algorithms to include stuff like the genome and long-horizon anchors? But you agree it doesn’t include evolution, right?
- Rohin Shah 20 May 2022 18:49 UTC
  4 points
  0
  All of the short-horizon, medium-horizon, or long-horizon paths would count as “scaling up 2020 algorithms”.
  I mostly ignore the genome anchor (see “Ignoring the genome anchor” in my opinion).
  I’m not entirely sure how you’re imagining redoing evolution. If you’re redoing it by creating a multiagent environment simulation, with the agents implemented via neural networks updated using some form of gradient descent, I think that’s “scaling up 2020 algorithms”.
  If you instead imagine having a long string of parameters (analogous to DNA) that tells you how to build a brain for the agent, and then learning involves making a random change to the long string of parameters and seeing how that goes, and keeping it if it’s good—I agree that’s not “scaling up 2020 algorithms”.
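A minimal sketch of the second scheme described above (a long parameter string, one random change at a time, kept only if it helps). Everything here is an illustrative assumption rather than anything from the bio anchors report: the bit-string "genome", the `toy_fitness` stand-in for "seeing how that goes", and the name `mutate_and_select` are all hypothetical.

```python
import random


def toy_fitness(genome):
    # Hypothetical stand-in for evaluating the agent a genome builds:
    # here, simply the number of 1 bits in the string.
    return sum(genome)


def mutate_and_select(genome_length=32, steps=2000, seed=0):
    """Random-mutation hill climbing over a parameter string (toy version)."""
    rng = random.Random(seed)
    genome = [0] * genome_length  # the "long string of parameters"
    best = toy_fitness(genome)
    for _ in range(steps):
        candidate = genome[:]
        i = rng.randrange(genome_length)
        candidate[i] = 1 - candidate[i]  # make one random change
        score = toy_fitness(candidate)   # see how that goes
        if score > best:                 # keep it only if it is better
            genome, best = candidate, score
    return genome, best


if __name__ == "__main__":
    final_genome, final_score = mutate_and_select()
    print(final_score, final_genome)
```

Note that no gradients appear anywhere: the only learning signal is whether a random change scored better, which is the contrast with the gradient-descent setup in the previous paragraph.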
  > thinking that when someone says “we literally could scale up 2020 algorithms and get TAI” they are imagining something less expensive than that
  I just literally mean “there is some obscene amount of compute, such that if you use that much compute with 2020 algorithms, and you did some engineering to make sure you could use that compute effectively (things more like hyperparameter tuning and less like inventing Transformers), and you got the data that was needed (who knows what that is), then you get TAI”. That’s the belief that makes you take bio anchors more seriously. Pre-bio-anchors, it would have been hard for me to give you a specific number for the obscene compute that would be needed.
  - Daniel Kokotajlo 20 May 2022 18:52 UTC
    2 points
    0
    Right, OK.
    
    Pre-bio-anchors, couldn’t you have at least thought that recapitulating evolution would be enough? Or are you counting that as part of the bio anchors framework?
    - Rohin Shah 21 May 2022 7:55 UTC
      2 points
      0
      What exactly does “recapitulating evolution” mean? If you mean simulating our laws of physics in an initial state that is as big as the actual world and includes, say, a perfect simulation of bacteria, and then letting the simulation evolve for the equivalent of billions of years until some parts of the environment implement general intelligence, then sure, that would be enough, but also that’s way way more compute than the evolution anchor (and also we don’t have the knowledge to set up the initial state right). (You could even then be worried about anthropic arguments saying that this won’t work.)
      If you instead mean that we have some simulated environment that we hope resembles the ancestral environment, and we put in simulated animal bodies with a neural network to control them, and then train those neural networks with current gradient descent or evolutionary algorithms, I would not then and do not now think that such an approach is clearly going to produce TAI given evolutionary anchor levels of compute.