My guess is they will make tweaks on the tokenization part. If I were given the task of adding more modalities to a Transformer, I would probably be scratching my head wondering if there was a way to universally tokenize any type of data. But that's an optimistic scenario; I don't expect them to come up with such a solution right now. So I think they will just add new state-of-the-art tokenizers, like ViT-VQGAN for images. Other than that, I am mostly curious whether we will be able to more clearly observe transfer learning when increasing the model size. That, I think, is the most important information that can come from GATO 2, because then we will be better equipped to follow Chinchilla's scaling laws without some potential slowdown. I am betting that we will, as long as the tasks are not too far from each other and we take the time to build good tokenizers.
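To make the "per-modality tokenizers feeding one flat token space" idea concrete, here is a minimal sketch in the spirit of Gato's setup. Everything in it is a placeholder assumption: the vocabulary sizes, the byte-level text tokenizer, the fake patch-to-codebook mapping (a real system would use a trained ViT-VQGAN-style encoder), and the mu-law binning for continuous values. It only illustrates how separate tokenizers can be offset into one shared vocabulary so a single Transformer sees one sequence.

```python
import numpy as np

# Hypothetical vocabulary layout: text tokens, image codebook tokens, and
# discretized continuous-value tokens all live in one flat token space.
TEXT_VOCAB = 32_000       # e.g. a SentencePiece-style subword vocab
IMAGE_CODEBOOK = 8_192    # e.g. a ViT-VQGAN-style codebook
VALUE_BINS = 1_024        # bins for mu-law-companded continuous values

IMAGE_OFFSET = TEXT_VOCAB
VALUE_OFFSET = TEXT_VOCAB + IMAGE_CODEBOOK

def tokenize_text(text: str) -> list[int]:
    # Placeholder: map raw bytes to ids; a real model would use a trained subword tokenizer.
    return [b % TEXT_VOCAB for b in text.encode("utf-8")]

def tokenize_image(image: np.ndarray) -> list[int]:
    # Placeholder: pretend each 16x16 patch maps to a codebook id; a real
    # ViT-VQGAN encoder would produce these ids from learned embeddings.
    patches = image.reshape(-1, 16, 16, image.shape[-1])
    return [IMAGE_OFFSET + (int(p.sum()) % IMAGE_CODEBOOK) for p in patches]

def tokenize_values(values: np.ndarray, mu: float = 255.0) -> list[int]:
    # Mu-law companding then uniform binning for continuous observations/actions.
    squashed = np.sign(values) * np.log1p(mu * np.abs(values)) / np.log1p(mu)
    bins = ((squashed + 1.0) / 2.0 * (VALUE_BINS - 1)).astype(int)
    return [VALUE_OFFSET + int(b) for b in bins]

# All modalities end up in one flat sequence the Transformer can consume.
sequence = (
    tokenize_text("pick up the red block")
    + tokenize_image(np.zeros((64, 64, 3), dtype=np.uint8))
    + tokenize_values(np.array([0.3, -0.7, 0.1]))
)
print(len(sequence), max(sequence) < VALUE_OFFSET + VALUE_BINS)
```

The point of the offsets is simply that each modality keeps its own tokenizer, and swapping in a better one (say, a stronger image tokenizer for GATO 2) only changes one function while the downstream Transformer interface stays the same.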