the simpler the utility function the easier time it has guaranteeing the alignment of the improved version
If we are talking about a theoretical argmax_a E(U|a) AI, where E(U|a) (the expectation of utility given action a) somehow points to the external world, then sure. If we are talking about a real AI that aspires to become the physical embodiment of that theoretical concept (with said aspiration somehow encoded outside of U, since U is simple), then things get hairier.
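To make the distinction concrete, here is a minimal toy sketch of the theoretical argmax_a E(U|a) agent over a discrete action set; all names and values here are illustrative assumptions, not anything from the discussion. The point it illustrates is that everything besides U itself, the world model, the expectation, and the search over actions, is machinery sitting outside the utility function.

```python
# Toy world model: for each action, a distribution over outcomes.
# All names and numbers are illustrative placeholders.
WORLD_MODEL = {
    "action_a": [("outcome_1", 0.7), ("outcome_2", 0.3)],
    "action_b": [("outcome_1", 0.2), ("outcome_3", 0.8)],
}

# A deliberately simple utility function U over outcomes.
U = {"outcome_1": 1.0, "outcome_2": 0.0, "outcome_3": 0.5}

def expected_utility(action):
    """E(U | a): expectation of U under the model's outcome distribution."""
    return sum(p * U[outcome] for outcome, p in WORLD_MODEL[action])

def choose_action():
    """argmax_a E(U | a): pick the action with the highest expected utility."""
    return max(WORLD_MODEL, key=expected_utility)

if __name__ == "__main__":
    # Everything except U (the model, the expectation, the argmax loop itself)
    # is machinery outside the utility function.
    print(choose_action())
```

In a real system, that surrounding machinery is where the aspiration to behave like the theoretical maximizer would have to live, and the simplicity of U by itself says nothing about how faithfully that machinery is preserved through self-improvement.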