I don’t think the confusions are that hard to resolve, although related confusions might be. Here are some distinct questions:
Will a given AI’s creation lead to good consequences?
To what extent can a given AI be said to have a utility function?
How can we define humanity’s utility function?
How closely does a given AI’s utility function approximate our definition?
Is a given AI’s utility function stable?
The standard SI position would be something like: an AI will only lead to good consequences if we carefully define humanity’s utility function, get the AI to approximate it extremely closely, and ensure that the AI’s utility function is stable, or that it only ever moves towards being a better approximation of humanity’s utility function. (I don’t see how that last one could reliably be expected to happen.)
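To pin down what “approximate extremely closely” and “stable” would even mean, here is one possible formalization; the symbols below ($U_H$, $U_{AI}(t)$, the metric $d$, and the tolerance $\epsilon$) are my own notation for the sake of illustration, not anything SI has actually committed to:

Let $U_H$ denote humanity’s utility function and $U_{AI}(t)$ the AI’s utility function at time $t$, with $d(\cdot,\cdot)$ some metric on utility functions (itself a nontrivial choice). The position above then roughly amounts to requiring

\[
  d\big(U_{AI}(0),\, U_H\big) \le \epsilon
\]

together with one of the stability conditions

\[
  U_{AI}(t) = U_{AI}(0) \ \text{ for all } t \ge 0,
  \qquad \text{or} \qquad
  d\big(U_{AI}(t),\, U_H\big) \ \text{ non-increasing in } t.
\]

Even granting this framing, the questions above about whether the AI has a well-defined $U_{AI}$ at all, and how $U_H$ could be defined, remain open.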