For what it’s worth, though I can’t point to specific predictions I was not at all surprised by multi-modality. It’s still a token prediction problem, there are not fundamental theoretical differences. I think that modestly more insights are necessary for these other problems.
For what it’s worth, though I can’t point to specific predictions I was not at all surprised by multi-modality. It’s still a token prediction problem, there are not fundamental theoretical differences. I think that modestly more insights are necessary for these other problems.