unless you think a ‘slight’ change in goals would produce a slight change in outcomes.
It depends on what sorts of changes. Slight changes in which subgoals are included in the goal produce much larger changes in outcomes as optimization power increases, but slight changes in how much weight each subgoal is given relative to the others can actually produce smaller changes in outcomes as optimization power increases, if it becomes possible to come close to maxing out every subgoal at the same time. It seems plausible that one could leave the format in which goals are encoded in the brain intact while getting a significant increase in capabilities, and that this would only cause the kinds of goal changes whose results are still not too bad according to the original goal.
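To make the asymmetry concrete, here is a minimal toy sketch (my own construction, not anything from the original comment): an optimizer splits a unit budget between two subgoals with diminishing returns, and each optimizer's choice is scored under the original equal-weight goal. The functions `g1`, `g2`, `optimize`, and `original_goal` are hypothetical names invented for this illustration.

```python
# Toy sketch (hypothetical construction): reweighting subgoals vs. dropping one.
# Claim illustrated: a slight change in relative weights costs little under the
# original goal, while omitting a subgoal entirely costs a lot.

import math

def g1(a):
    return math.sqrt(a)        # subgoal 1: diminishing returns in its budget share

def g2(a):
    return math.sqrt(1 - a)    # subgoal 2: diminishing returns in the remainder

def optimize(w1, w2, steps=10_000):
    """Exhaustively pick the budget split a that maximizes w1*g1(a) + w2*g2(a)."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda a: w1 * g1(a) + w2 * g2(a))

def original_goal(a):
    return 0.5 * g1(a) + 0.5 * g2(a)   # the goal we actually care about

base       = optimize(0.5, 0.5)   # original weights
reweighted = optimize(0.6, 0.4)   # slight change in relative weights
dropped    = optimize(1.0, 0.0)   # subgoal 2 omitted from the goal entirely

print(f"original weights:    split={base:.3f}, value={original_goal(base):.3f}")
print(f"slightly reweighted: split={reweighted:.3f}, value={original_goal(reweighted):.3f}")
print(f"subgoal dropped:     split={dropped:.3f}, value={original_goal(dropped):.3f}")
# Typical output: ~0.707 for the original weights, ~0.693 after reweighting,
# but only 0.500 once a subgoal is omitted -- omission, not reweighting,
# is what gets severely punished under the original goal.
```

This only illustrates the reweighting-versus-omission asymmetry, not the full claim about how the gap behaves as optimization power grows; in this toy, "more optimization power" would just mean a finer search grid.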
seems kind of ludicrous if we’re talking about empathy and sadism.
Most pairs of goals are not directly opposed to each other.