We focus our training at the operating point of model scale that allows real-time control of real-world robots, currently around 1.2B parameters in the case of Gato. As hardware and model architectures improve, this operating point will naturally increase the feasible model size, pushing generalist models higher up the scaling law curve. For simplicity Gato was trained offline in a purely supervised manner; however, in principle, there is no reason it could not also be trained with either offline or online reinforcement learning (RL).
And there is, of course, absolutely no reason to think that it wouldn’t get as good as text/image models like Flamingo or the new ULM2 if it was trained & scaled as much as they were; the problem is that you can’t run such large dense models at the necessary low latency for realtime robotics… Perhaps finally a genuine application for MoEs to enable plugging in very large unimodal/multimodal models.
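To make the latency argument concrete: in a top-1 mixture-of-experts layer, each token only runs through one expert FFN, so per-token compute is roughly 1/E of a dense layer holding the same total parameters. A toy sketch (all names, shapes, and weights here are made up for illustration, not anything from the Gato paper):

```python
import numpy as np

rng = np.random.default_rng(0)

D, E, H = 64, 8, 256  # model width, number of experts, expert hidden size

# One small FFN ("expert") per slot; only the routed expert runs per token,
# so active compute is ~1/E of a dense layer with the same total parameters.
experts = [(rng.standard_normal((D, H)) * 0.02,
            rng.standard_normal((H, D)) * 0.02) for _ in range(E)]
gate = rng.standard_normal((D, E)) * 0.02  # router weights

def moe_layer(x):
    """Top-1 mixture-of-experts layer for a single token vector x."""
    logits = x @ gate
    e = int(np.argmax(logits))       # route to exactly one expert
    w1, w2 = experts[e]
    h = np.maximum(x @ w1, 0.0)      # ReLU FFN of the chosen expert
    return x + h @ w2, e             # residual connection

x = rng.standard_normal(D)
y, chosen = moe_layer(x)
print(y.shape, chosen)
```

The total parameter count grows with E while per-token latency stays near constant, which is exactly why MoEs look attractive for plugging large models into a realtime control loop.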
A principled solution would probably involve running different parts of the model at different frequencies. But you could also just scale breadth and see how far it goes. The human brain is not very deep—just recursive.
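The different-frequencies idea can be sketched as a two-rate loop: a large, slow "planner" refreshes a latent goal occasionally while a small, fast "policy" emits an action every control tick. Everything below (names, rates, weight matrices) is a hypothetical illustration, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-rate controller: a big "planner" updates a latent goal at
# 2 Hz, while a small "policy" maps (goal, observation) to actions at 50 Hz.
PLAN_EVERY = 25  # fast ticks per slow update (50 Hz / 2 Hz)

W_plan = rng.standard_normal((32, 16)) * 0.1      # stand-in for the big model
W_act = rng.standard_normal((16 + 32, 4)) * 0.1   # stand-in for the small policy

goal = np.zeros(16)
actions = []
for t in range(100):
    obs = rng.standard_normal(32)
    if t % PLAN_EVERY == 0:
        goal = np.tanh(obs @ W_plan)  # expensive call, run rarely
    # cheap call, run every tick against the cached goal
    act = np.tanh(np.concatenate([goal, obs]) @ W_act)
    actions.append(act)

print(len(actions), actions[0].shape)  # 100 actions of dimension 4
```

The large model's latency then only has to beat the planning interval, not the control interval, which is the sense in which this is more "principled" than just making the whole model shallower and wider.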
I wouldn’t have connected breadth and recursion. (I’d have just thought, well, self-calling.)
A friend pointed out on Facebook that Gato uses TPU-v3s. Not sure why—I thought Google already had v4s available for internal use a while ago? In any case, the TPU-v4 might potentially help a lot for the latency issue.
Two main options:
* It was trained some time ago (e.g. a year before publication) and only released now
* All the TPU-v4s were busy with something even more important
They trained it on TPU-v3s; however, the robot inference was run on a GeForce RTX 3090 (see section G).
TPUs are mostly designed for data centers and are not really usable for on-device inference.