What will the scaled-up GATO look like? (Updated with questions)
Demis Hassabis mentioned a few months ago that DeepMind is in the middle of scaling up its generalist agent GATO. Based on this, I would expect it to come out either by the end of this year or at the beginning of the next. The original model had just 1 billion parameters and was able to play Atari games, operate a robotic arm, and serve as a (relatively small) language model, so scaling it up to several hundred billion (or even trillions of) parameters seems to have great potential for capability progress.
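For background on how a single transformer handles that mix of modalities: the Gato paper describes flattening everything into one token sequence, with text tokenized via a SentencePiece vocabulary, images as patches, and continuous observations/actions mu-law companded and discretized into 1,024 uniform bins. Below is a minimal sketch of the continuous-value tokenization; the constants follow the paper's description as I understand it, and DeepMind's actual implementation is not public.

```python
import numpy as np

TEXT_VOCAB = 32_000   # SentencePiece subword ids occupy [0, 32000) per the paper
NUM_BINS = 1_024      # continuous values share the bins [32000, 33024)

def mu_law(x, mu=100.0, m=256.0):
    """Companding transform that squashes large continuous values into [-1, 1]."""
    return np.sign(x) * np.log(np.abs(x) * mu + 1.0) / np.log(m * mu + 1.0)

def tokenize_continuous(x):
    """Map continuous observations/actions to discrete token ids."""
    squashed = np.clip(mu_law(np.asarray(x, dtype=np.float64)), -1.0, 1.0)
    bins = ((squashed + 1.0) / 2.0 * (NUM_BINS - 1)).round().astype(int)
    return bins + TEXT_VOCAB  # shift past the text vocabulary

# A robot-arm joint angle or an Atari action becomes an ordinary token,
# so one transformer can consume text, proprioception, and actions
# as a single flat sequence.
print(tokenize_continuous([0.03, -1.5, 7.2]))
```

If the scaled-up version keeps this recipe, "new capabilities" largely means new token streams plus more capacity, which is part of what makes the scaling question interesting.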
There were similar discussions about the most impressive things GPT-4 would be capable of, and likewise what it won't be able to achieve. However, since GPT-4 is likely to be limited to text, a scaled-up GATO would probably exhibit an entirely new and different set of capabilities.
So let’s discuss the following questions: What will the scaled-up GATO model, data, and training look like? What will it be capable of, and which capabilities would be the most impressive? On the other hand, what are the things it won’t achieve?
Update:
To get more engagement with the post, I provide some more specific questions to make predictions on (thanks to Daniel Kokotajlo for the idea):
How many parameters will the model have? (The current Gato has 1 billion.)
How large will the context window be?
Will we see more advanced algorithmic improvements, e.g. a new type of long-term memory, actual RL, or similar?
What types of data will it be trained on?
Will we observe transfer learning, e.g. improvements in language modeling after training on audio?
Will GATO see bigger improvements from chain-of-thought-style prompting than an LLM of similar size? (See the sketch after this list.)
Will it be able to play new Atari games without being trained on them?
Feel free to suggest more questions and I will add them.
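To make the chain-of-thought question concrete, here is a rough sketch of the two prompting styles one might compare. The exemplar and task are standard illustrations of the technique, not anything tested on Gato:

```python
# Direct prompting vs. chain-of-thought (CoT) prompting, contrasted
# with a made-up few-shot exemplar for illustration.

exemplar_q = "Roger has 5 tennis balls. He buys 2 cans of 3. How many does he have?"
exemplar_cot = "Roger started with 5 balls. 2 cans of 3 is 6 balls. 5 + 6 = 11. The answer is 11."

task = ("The cafeteria had 23 apples. They used 20 and bought 6 more. "
        "How many apples do they have?")

# Direct style: the exemplar shows only the final answer.
direct_prompt = f"Q: {exemplar_q}\nA: The answer is 11.\n\nQ: {task}\nA:"

# CoT style: the exemplar shows the intermediate reasoning steps.
cot_prompt = f"Q: {exemplar_q}\nA: {exemplar_cot}\n\nQ: {task}\nA:"

print(cot_prompt)
```

The prediction question is whether a multimodal GATO gains more from the CoT-style prompt, relative to the direct one, than a text-only LLM of the same size would.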