IIUC, this addresses the ontology problem in AIXI by assuming that the domain of our utility function already covers every possible computable ontology, so that whichever one turns out to be correct, we already know what to do with it. If I take the AIXI formalism as a literal description of the universe (i.e. not just dualism, but also that the AI runs on a hypercomputer, the environment runs on a Turing machine, and the utility function cares only about the environment, not the AI's internals), then I think the proposal works.
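To make that concrete, here is a minimal sketch of the value function I take the proposal to be using (the notation is mine, not from the post):

$$V(\pi) = \sum_{\nu \in \mathcal{M}} w_\nu \, \mathbb{E}^{\pi}_{\nu}\big[U(s_\nu)\big]$$

where $\mathcal{M}$ is the class of computable environment programs, $w_\nu$ their prior weights, $s_\nu$ the full internal state of program $\nu$, and $U$ a utility function whose domain includes every such state. Under the literal reading above, the true environment $\mu$ is one of the $\nu$'s, so $U$ does get evaluated on the true state with nonzero weight.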
But under the more reasonable assumption that the environment is at least as computationally complex as the AI itself (whether AIXI in an uncomputable world, AIXI-tl in an EXPTIME world, or whatever), Solomonoff induction still beats any computable sensory predictor, yet the proposed method of dealing with utility functions fails: the Solomonoff mixture doesn't contain a simulation of the true environment, so the utility function is never evaluated with the true state of the environment as input. It's evaluated on various approximations of the environment, but those approximations are selected to correctly predict sensory inputs, not to correctly predict utilities. If the proposal is supposed to work for arbitrary utility functions, then there's no reason to expect those two approximation problems to be related at all, even before Goodhart's law comes into the picture; or, if you intended to assume some smoothness-under-approximation constraint on the utility function, then that's exactly the part of the problem that you handwaved.
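In the same (made-up) notation, the failure mode is that the posterior weights are driven only by percept likelihoods, something like

$$w_\nu(h_t) \propto w_\nu \, \nu(e_{1:t} \mid a_{1:t})$$

so the mixture concentrates on programs that predict the percept history $e_{1:t}$ well, and nothing in that update ever consults $U$. Once $\mu \notin \mathcal{M}$, the quantity actually being optimized, $\sum_\nu w_\nu(h_t)\,U(s_\nu)$, can be arbitrarily far from $U(s_\mu)$ unless $U$ happens to assign similar values to percept-indistinguishable environments, which is precisely the smoothness-under-approximation constraint in question.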