gwern comments on “A Generalist Agent”: New DeepMind Publication

gwern May 12, 2022, 5:45 PM
22 points

simulate what would happen if it does various different things

It is a generative Transformer trained offline to predict tokens. Why can’t it?
- tailcalled May 12, 2022, 5:48 PM
  3 points
  Parent
  Well, it could learn to do it. But that’d be like a human doing math to predict how a system works, rather than a human intuiting how a system works. Massive difference in speed means some other algorithm would probably go AGI first?
  While I don’t dispute that it could learn to do it, the current trained model cannot do this.
  - gwern May 12, 2022, 6:11 PM
    23 points
    Parent
    
    Well, it could learn to do it.
    
    I mean, in what sense has a Decision Transformer like Gato not already learned to do it by extensive 1-step prediction?
    
    I mean for one, its architecture does not permit its weights to change without receiving training data, and it does not generate training data itself.
    
    As we know perfectly well by now, frozen weights do not preclude runtime learning, and Gato is trained on meta-learning tasks (MetaWorld and Procgen, plus the real-world datasets which are longtailed and elicit meta-learning in GPT-3 etc). They also mention adding Transformer-XL recurrent memory at runtime.
    - tailcalled May 12, 2022, 9:46 PM
      5 points
      Parent
      I mean, in what sense has a Decision Transformer like Gato not already learned to do it by extensive 1-step prediction?
      I don’t think Gato does the sort of training-in-simulation that Dreamer does. And that training-in-simulation seems like a major part of intelligence. So I think Dreamer has a component needed^[1] for AGI that Gato lacks.
      As we know perfectly well by now, frozen weights do not preclude runtime learning, and Gato is trained on meta-learning tasks (MetaWorld and Procgen, plus the real-world datasets which are longtailed and elicit meta-learning in GPT-3 etc). They also mention adding Transformer-XL recurrent memory at runtime.
      Gato supports a sequence length of only 1048, which means that it cannot remember its meta-”learned” things for very long. Non-frozen weights would eliminate that problem.
      ^
      Well, “needed”, you could perhaps brute-force your way to a solution to AGI without this component, but then the problem is that Gato does not have enough dakka to be generally intelligent.
    - Yitz May 12, 2022, 6:43 PM
      5 points
      Parent
      I’m also a bit concerned we may be moving the goalposts here a bit. Not sure if there’s a clear way to quantify how that’s being done, just a general impression I’m getting
      - tailcalled May 12, 2022, 10:02 PM
        5 points
        Parent
        I don’t agree that I’m moving the goalposts, these were the sorts of ingredients I was thinking about before seeing Gato, as I was inspired by e.g. Dreamer.