I think this is a little bit off. The world doesn’t have a True Distribution, it’s just the world. A more careful treatment would involve talking about why we expect Solomonoff induction to work well, why the speed prior (as in universal search prior) also works in theory, and what you think might be different in practice (e.g. if you’re actually constructing a program with gradient descent using something like “description length” or “runtime” as a loss).