lorepieri comments on Gato’s Generalisation: Predictions and Experiments I’d Like to See

lorepieri 18 May 2022 11:48 UTC
5 points
Fair analysis, I agree with the conclusions. The main contribution seems to be a demonstration that a single transformer can handle many tasks at the same time.
Not sure if you sorted the tests in order of relevance, but I also consider the “held-out” test the most revealing. Besides fine-tuning, it would be interesting to test the zero-shot capabilities.
- Oliver Sourbut 18 May 2022 16:41 UTC
  3 points
  I didn’t methodically order the experiment ideas, but they are meant to be roughly presented in order of some combination of concreteness/tractability and importance.
  
  What do you think of my speculation about the tagging/switching/routing internal mechanism?
  - lorepieri 18 May 2022 21:02 UTC
    2 points
    When you say “switching” it reminds me of the “big switch” approach of the General Problem Solver (https://en.wikipedia.org/wiki/General_Problem_Solver).
    Regarding how they do it, I believe the relevant passage is:
    Because distinct tasks within a domain can share identical embodiments, observation formats and action specifications, the model sometimes needs further context to disambiguate tasks. Rather than providing e.g. one-hot task identifiers, we instead take inspiration from (Brown et al., 2020; Sanh et al., 2022; Wei et al., 2021) and use prompt conditioning.
    I guess it should be possible to locate the activation paths for different tasks, as the tasks are pretty well separated. Something along the lines of https://github.com/jalammar/ecco
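    To make the "locate the activation paths" idea a bit more concrete, here is a minimal sketch of one simple way to start: record hidden activations while the model runs on two tasks, average them per unit, and rank units by how much the averages differ. This does not use the ecco API and is not how the Gato authors analysed anything; the function names, the task labels and the toy numbers are all made up for illustration.

```python
from statistics import mean
from typing import Dict, List


def mean_activation_per_unit(samples: List[List[float]]) -> List[float]:
    """Average each unit's activation over all recorded samples of one task."""
    # zip(*samples) regroups the values so that each tuple holds one unit's
    # activations across every sample.
    return [mean(unit_values) for unit_values in zip(*samples)]


def rank_task_selective_units(
    activations_by_task: Dict[str, List[List[float]]],
    task_a: str,
    task_b: str,
) -> List[int]:
    """Return unit indices sorted by |mean(task_a) - mean(task_b)|, largest first."""
    mean_a = mean_activation_per_unit(activations_by_task[task_a])
    mean_b = mean_activation_per_unit(activations_by_task[task_b])
    gaps = [abs(a - b) for a, b in zip(mean_a, mean_b)]
    return sorted(range(len(gaps)), key=lambda i: gaps[i], reverse=True)


# Hypothetical activations: 3 units, 2 recorded samples per task.
# These are invented numbers, not measurements from any real model.
toy_activations = {
    "atari": [[0.9, 0.1, 0.5], [1.1, 0.2, 0.5]],
    "captioning": [[0.1, 0.1, 0.6], [0.0, 0.3, 0.4]],
}

ranking = rank_task_selective_units(toy_activations, "atari", "captioning")
print(ranking)  # units most different between the two tasks come first
```

    In the toy data, unit 0 fires strongly on one task and barely on the other, so it comes out on top. On a real model, a difference of means like this would only be a first pass; it says nothing about whether the unit is causally involved in routing between tasks.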