If the AI is capable of reflection and self-modification, it should immediately notice that it would maximize its expected utility, according to its current utility function, by modifying itself to use U″(w) = sum of w’s utilities according to people who existed at time T0, where T0 is a constant representing the time of self-modification.
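For concreteness, the two candidate utility functions can be written out explicitly (the notation here is mine; the thread only gives them informally):

$$U'(w, t) = \sum_{p \in P(t)} u_p(w), \qquad U''(w) = \sum_{p \in P(T_0)} u_p(w),$$

where $P(t)$ is the set of people who exist at time $t$, $u_p$ is person $p$'s utility function over worlds $w$, and $T_0$ is the fixed time of self-modification.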
To do this it would have to be badly programmed. We start out with a time-dependent utility function U’(t). We propose to change it to U″, where U″(t) = U’(0) for all times t. But those are different functions! The utility over time of a particular future will be different for U’ and U″, and so will be the expected utility of a given action.
The expression “current utility function” is ambiguous when the utility function is time-dependent.
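A toy sketch of that point (my own construction; the people, outcomes, and numbers are invented purely for illustration), showing that the time-dependent U’ and the frozen U″ can rank the same pair of actions differently:

```python
# A toy illustration (hypothetical names and numbers, not from the original post)
# of why a time-dependent utility function U' and its "frozen" counterpart U''
# can assign different expected utilities to the same action.

# Each person is modelled as a dict of preference weights over outcomes.
alice = {"growth": 1.0, "stasis": 0.2}   # exists at t = 0
bob   = {"growth": 0.1, "stasis": 1.0}   # only exists at t = 1

def u_prime(trajectory, t):
    """U'(t): sum of the outcome's utilities according to the people existing at time t."""
    population, outcome = trajectory[t]
    return sum(person[outcome] for person in population)

def u_frozen(trajectory, t, frozen_population):
    """U''(t) = U'(0): always score the outcome by the people who existed at time 0."""
    _, outcome = trajectory[t]
    return sum(person[outcome] for person in frozen_population)

# Two candidate actions, each yielding a (population, outcome) pair at t = 0 and t = 1.
action_a = {0: ([alice], "growth"), 1: ([alice, bob], "growth")}
action_b = {0: ([alice], "growth"), 1: ([alice, bob], "stasis")}

frozen_pop = [alice]  # the people who exist at the time of self-modification

for name, traj in [("A", action_a), ("B", action_b)]:
    total_prime  = sum(u_prime(traj, t) for t in traj)
    total_frozen = sum(u_frozen(traj, t, frozen_pop) for t in traj)
    print(f"action {name}: U' total = {total_prime:.1f}, U'' total = {total_frozen:.1f}")

# action A: U' total = 2.1, U'' total = 2.0
# action B: U' total = 2.2, U'' total = 1.2
```

In this toy setup U’ ranks B above A while the frozen U″ ranks A above B, so the two really are different functions, and an agent that swaps one for the other has changed what it is optimizing.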
I agree with the above comments that concern for future individuals would be contained in the utility functions of people who exist now, but there’s an ambiguity in the AI’s utility function: it seems forbidden to consider the future or past output of its own utility function. By limiting itself to the concerns of the people who currently exist, if it were to try to maximize that output over all time it would be concerning itself with people who do not yet, or no longer, exist, which is directly at odds with its utility function. Being barred from such considerations, it could make sense for it to change its own utility function to restrict concern to the people existing at that time, IF that is what most satisfies the preferences of those people.
While the default near-sightedness of people is bad news here, if the AI succeeds in modelling us as “smarter, more the people we want to be,” etc., then its utility function seems unlikely to become so fixed in time.