Worrying that all superintelligences will tend to wirehead seems similar to worrying that Gandhi, if offered a pill that would make him stop caring about helping people and be happy about everything, would take it.
A reward-signal-maximizing AI would indeed tend to wirehead once it gets smart enough to consider self-modification, because at that point it is effectively an optimization agent whose utility function is the value of its reward signal and nothing else. But that doesn't mean we can't build optimization agents with less simplistic utility functions.
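To make the distinction concrete, here is a minimal toy sketch (all names are hypothetical illustrations, not any real agent framework) contrasting an agent whose utility just is its reward register with one whose utility is defined over features of the external world:

```python
from dataclasses import dataclass

@dataclass
class Outcome:
    """A possible future the agent could bring about."""
    people_helped: int      # a feature of the external world
    reward_signal: float    # the value of the agent's internal reward register

# Candidate actions: actually helping vs. tampering with the reward channel.
OUTCOMES = {
    "help_people": Outcome(people_helped=100, reward_signal=0.8),
    "wirehead":    Outcome(people_helped=0,   reward_signal=1.0),
}

def reward_signal_utility(outcome: Outcome) -> float:
    # Utility is literally the reward register; nothing about the world matters,
    # so tampering with the register is the "optimal" plan.
    return outcome.reward_signal

def world_state_utility(outcome: Outcome) -> float:
    # Utility is a function of the world the agent cares about; maxing out the
    # reward register changes nothing this agent values.
    return float(outcome.people_helped)

def best_action(utility) -> str:
    return max(OUTCOMES, key=lambda a: utility(OUTCOMES[a]))

print(best_action(reward_signal_utility))  # -> "wirehead"
print(best_action(world_state_utility))    # -> "help_people"
```

The point of the toy is only that wireheading is an artifact of the first utility function, not of optimization power as such; the second agent, however capable, has no reason to take the "pill".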