Two points.
Firstly, humans are unable to self-modify to the degree that an AGI will be able to. It is not clear to me that a human given the chance to self-modify wouldn’t immediately wirehead. An AGI may therefore require a higher degree of alignment than individual humans demonstrate.
Second, it is surely worth noting that humans aren’t particularly aligned with their own happiness, or with avoiding suffering, when the consequences of their actions are obscured by time and place.
In the developed world, humans make dietary decisions that lead to horrific treatment of animals, despite most humans not being willing to torture an animal themselves.
It also appears quite easy for the environment to trick individual humans into making decisions that increase their suffering in the long term in exchange for apparent short-term pleasure. A drug addict is the obvious example, but who among us can say they haven’t wasted hours of their lives browsing the internet?