Aren’t they just averaging together to yield yet another somewhat-but-not-quite-right function?
Indeed we don’t want such linear behavior. The AI should preserve the potential for maximization of any candidate utility function—first so it has time to acquire all the environment’s evidence about the utility function, and then for the hypothetical future scenario of us deciding to shut it off.
Why not just tell the AI the truth? Which, in this case, is: although we might not be able to give it useful information to differentiate between certain complex candidate hypotheses at this point in time, this will become possible as we reflect and enhance our intelligence. That process of reflection and enhancement will take an eyeblink in cosmic time. The amount of time from now until the heat death of the universe is so large that instead of maximizing EU according to a narrow conception of our values in the short term, it’s better for the AI’s actions to remain compatible with a broad swath of potential values that we might discover, on reflection, are the correct ones.
See this comment. Stuart and I are discussing what happens after the AI’s beliefs have converged as much as they’re going to, but some uncertainty still remains.