An AI can only “want” to “refine/improve” its goals if that “desire to change goals” is itself included in the goals. It is not the actual highest-level goals that change. There would have to be a “have an evolving definition of happy that may evolve in the following ways”-meta goal, otherwise you get a logical error: The AI having the goal X1 to change its goals X2, without X1 being part of its goals! Do you see the reductio?
The way my brain works is not in any meaningful sense part of my terminal goals. My visual cortex does not work the way it does due to some goal X1 (if we don’t want to resort to natural selection and goals external to brains).
A superhuman general intelligence will be generally intelligent without that being part of its utility-function, or otherwise you might as well define all of the code to be the utility-function.
What I am claiming, in your parlance, is that acting intelligently is X1 and will be part of any AI by default. I am further saying that if an AI were programmed to be generally intelligent, then it would have to be programmed to be selectively stupid in order to fail at doing what it was meant to do while acting generally intelligent at doing what it was not meant to do.