I know these approaches and they don’t work. Maybe they will start working at some point, but to me very unclear when and why that should happen. All approaches that use recurrence based on token-output are fundamentally impoverished compared to real recurrence.
About multi-modality:
I expect these limitations to largely vanish as models are scaled up and trained end-to-end on a large variety of modalities.
Yes, maybe Gemini will be able to really hear and see.
I know these approaches and they don’t work. Maybe they will start working at some point, but to me very unclear when and why that should happen. All approaches that use recurrence based on token-output are fundamentally impoverished compared to real recurrence.
About multi-modality:
Yes, maybe Gemini will be able to really hear and see.