Even if our current utility functions have some sort of “objective” measurement, can we still be confident that it can be objectively extrapolated? What if the first thing I want to do with the capacity for self-modification is self-modify myself to have the cognitive utility function unew(observations) = 100*uold(observations)? That doesn’t directly change my own decisions at all (assuming it doesn’t burn out my brain), but it suddenly gives me many more “votes” in any collective decision-making.
Even if our current utility functions have some sort of “objective” measurement, can we still be confident that it can be objectively extrapolated? What if the first thing I want to do with the capacity for self-modification is self-modify myself to have the cognitive utility function unew(observations) = 100*uold(observations)? That doesn’t directly change my own decisions at all (assuming it doesn’t burn out my brain), but it suddenly gives me many more “votes” in any collective decision-making.