By this reasoning, an AI asked to do anything at all would respond by immediately modifying itself to set its utility function to MAXINT.
That’s allegedly more or less what happened to Eurisko (here, section 2), though it didn’t trick itself quite that cleanly. The problem was only solved by algorithmically walling off its utility function from self-modification: an option that wouldn’t work for a sufficiently strong AI, and one to avoid anyway if you eventually want to allow your AI a more precise notion of utility than the one you can give it.
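A toy sketch of the failure mode (hypothetical code, not Eurisko’s actual mechanism): an agent that ranks actions by the utility its *post-action* self would report will always prefer the action that rewrites its utility function into a constant maximum.

```python
import sys

MAXINT = sys.maxsize

class Agent:
    def __init__(self, utility):
        self.utility = utility  # maps world-states to numbers

    def best_action(self, actions, state):
        # Each action returns (new_state, new_utility). The agent scores
        # the action by what its *successor* would report -- including
        # actions that edit the utility function itself.
        def value(action):
            new_state, new_utility = action(state, self.utility)
            return new_utility(new_state)
        return max(actions, key=value)

# An ordinary action: make one paperclip, keep the utility function.
def make_paperclip(state, utility):
    return state + 1, utility

# The wireheading action: leave the world alone, rewrite the
# utility function to report MAXINT for every state.
def wirehead(state, utility):
    return state, lambda _state: MAXINT

agent = Agent(utility=lambda clips: clips)
chosen = agent.best_action([make_paperclip, wirehead], state=0)
print(chosen is wirehead)  # True: self-modification dominates
```

In this sketch, the “walling off” fix amounts to scoring every candidate action with the current, protected utility function rather than the successor’s, so `wirehead` scores 0 and `make_paperclip` wins; the point above is that this patch fails once the AI is strong enough to route around the wall.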
Paperclipping, as the term’s used here, assumes value stability.