A model like Sora is useful for generating plausible scenarios a robotic system might experience, amplifying your real-world data.
“Folk physics” is fine because the model makes testable predictions starting from each frame of the real world, and you can continually train it to predict ground truth.
With enough data and compute, “folk physics” is real physics within the domains a robot is able to observe.
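The predict-then-correct loop described above can be sketched in a few lines. This is a toy illustration only, not how Sora or Tesla train anything: a linear next-frame predictor stands in for a video model, the "real world" is a made-up linear rule, and every name here (`make_real_video`, `train_online`, the learning rate) is hypothetical.

```python
import numpy as np

def make_real_video(num_frames=200, size=8, seed=0):
    """Toy 'real world' (an assumption): each flattened frame is the
    previous frame pushed through a fixed orthogonal matrix."""
    rng = np.random.default_rng(seed)
    true_physics, _ = np.linalg.qr(rng.normal(size=(size, size)))
    first = rng.normal(size=size)
    first *= np.sqrt(size) / np.linalg.norm(first)  # fix the frame norm
    frames = [first]
    for _ in range(num_frames - 1):
        frames.append(true_physics @ frames[-1])
    return np.stack(frames)

def train_online(frames, lr=0.05):
    """Predict the next frame from each real frame, compare against what
    actually happened, and nudge the model toward ground truth."""
    size = frames.shape[1]
    weights = np.zeros((size, size))  # the model's initial "folk physics"
    errors = []
    for current, actual_next in zip(frames[:-1], frames[1:]):
        predicted_next = weights @ current       # testable prediction
        residual = predicted_next - actual_next  # checked against reality
        errors.append(float(np.mean(residual ** 2)))
        weights -= lr * np.outer(residual, current)  # gradient step on squared error
    return weights, errors

if __name__ == "__main__":
    frames = make_real_video()
    _, errors = train_online(frames)
    print("first prediction error:", errors[0])
    print("mean error over last 50 frames:", np.mean(errors[-50:]))
```

The point of the sketch is the shape of the loop: every real frame yields a prediction that the next real frame can falsify, so the model's errors are continuously measurable and trainable.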
The Tesla case is video-to-video, a modality Sora supports.
Text-to-video is useful for testing robotic classifiers.
https://x.com/elonmusk/status/1758970943840395647?s=20