Sorry, I think that particular sentence of mine was poorly written (and got appropriate pushback at the time). I still endorse my follow-up comment, which includes this clarification:
The thing I’m not that interested in (from a “how scared should we be” or “timelines” perspective) is when you take a bunch of different tasks, shove them into a single “generic agent”, and the resulting agent is worse on most of the tasks and isn’t correspondingly better at some new task that none of the previous systems could do.
In particular, my impression of Gato is that it did not show much synergy. I agree that synergy is possible and likely to increase with additional scale (and I’m pretty sure I would have said so at the time, especially since I cited a different example of positive transfer).
(Note I haven’t read the mixed-modal scaling laws paper in detail so I may be missing an important point about it.)