It seems to be able to understand video rather than just images from the demos; I'd assume that will give it much better time understanding too. (Gemini also has video input.)
Are you saying this because temporal understanding is necessary for audio? Are there any tests that could be done with just the text interface to see whether it understands time better? I can't really think of any (besides just going off vibes after a bunch of interaction).
I imagine its music skills are a good bit stronger. It's more a statement of curiosity about longer-term time reasoning, on the scale of hours to days.
This means it knows about time in a much deeper sense than previous large public models. I wonder how far that goes.
Gemini also supported audio natively.
Oh, interesting, okay. I certainly didn't notice any strong effect like this when talking to Gemini previously.