gwern comments on “A Generalist Agent”: New DeepMind Publication

gwern May 12, 2022, 7:31 PM
20 points
And vice-versa: transfer Gato to the new task, then finetune and sparsify/distill it (eg turn the Transformer into an RNN, or train with Transformer-XL rather than only using it at runtime) once a task becomes common enough to justify the amortized expense.
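
To make "common enough to justify the amortized expense" concrete, here is a rough break-even sketch. The function names and all cost figures are hypothetical, not taken from the Gato paper; it just assumes a one-time finetune/distill cost and a fixed per-query cost for the generalist and for the specialized model.

```python
import math


def break_even_queries(one_time_cost, generalist_cost_per_query, specialist_cost_per_query):
    """Smallest query count at which specializing pays for itself.

    All arguments are hypothetical costs in the same unit (eg GPU-hours).
    Returns None if the specialized model is not cheaper per query.
    """
    saving = generalist_cost_per_query - specialist_cost_per_query
    if saving <= 0:
        return None  # no per-query saving, so the one-time cost is never recovered
    return math.ceil(one_time_cost / saving)


def should_specialize(expected_queries, one_time_cost,
                      generalist_cost_per_query, specialist_cost_per_query):
    """True if the expected task volume reaches the break-even point."""
    n = break_even_queries(one_time_cost, generalist_cost_per_query,
                           specialist_cost_per_query)
    return n is not None and expected_queries >= n


if __name__ == "__main__":
    # Made-up numbers: distillation costs 1000 units, and the distilled
    # model halves the per-query cost from 1.0 to 0.5 units.
    print(break_even_queries(1000.0, 1.0, 0.5))
    print(should_specialize(5000, 1000.0, 1.0, 0.5))
```

Below the break-even volume you would just keep running the generalist on the task; above it, the finetune/distill step starts paying for itself.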