Wild. If I’m reading the paper right, this uses the same dataset as RT-1 to ground the fine-tuning of the robot-commanding tokens; they just get to fine-tune an off-the-shelf multimodal transformer rather than having to build a custom solution (as in RT-1), and it works better.