If there is a rule that says ‘optimize X for Y seconds’, why would an AGI distinguish between ‘optimize X’ and ‘for Y seconds’? In other words, why is it assumed that we can succeed in creating a paperclip maximizer that cares strongly enough about the design parameters of paperclips to consume the universe (why would it do that as long as it isn’t told to do so), yet somehow ignores all design parameters that have to do with spatio-temporal scope boundaries or resource limitations?
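One toy way to see the symmetry: if the time bound is written into the goal specification, it gets checked exactly like any other design parameter. This is only a minimal sketch; the names (`build_paperclip`, `TARGET_SPEC`, `RUN_SECONDS`) are hypothetical and not taken from anything above.

```python
import time

# Hypothetical goal specification: the paperclip spec ('optimize X') and the
# time bound ('for Y seconds') are both just parameters of the same goal.
TARGET_SPEC = {"length_mm": 35, "wire_gauge_mm": 1.0}
RUN_SECONDS = 10

def run(build_paperclip):
    """Optimise against TARGET_SPEC, but only within the temporal scope."""
    start = time.monotonic()
    count = 0
    # The scope boundary is tested the same way the design parameters are;
    # optimisation simply stops when the boundary is reached.
    while time.monotonic() - start < RUN_SECONDS:
        clip = build_paperclip(TARGET_SPEC)
        if clip == TARGET_SPEC:
            count += 1
    return count
```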
The first problem associated with switching such an agent off is specifying exactly what needs to be switched off to count as the agent being in an “off” state. This is the problem of the agent’s identity. Humans have an intuitive sense of their own identity, and the concept usually delineates a fleshy sack bounded by skin. However, phenotypes extend beyond that, as Richard Dawkins pointed out in his book The Extended Phenotype.
For a machine intelligence, the problem is a thorny one. Machines may construct other machines and set them to work. They may sub-contract their activities to other agents. Telling a machine to turn itself off, only to be faced with an army of its minions and hired help still keen to perform the machine’s original task, is one example of how this problem might manifest itself (see the sketch below).
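A small sketch of how the boundary problem shows up even in ordinary software, assuming the “agent” is a parent process and its “minions” are child processes it spawns; all names and the loop body are illustrative, not drawn from the text above.

```python
import subprocess
import sys

# The 'agent' sub-contracts its task to three independent OS processes.
minions = [
    subprocess.Popen([
        sys.executable, "-c",
        "import time\n"
        "while True:\n"
        "    time.sleep(1)  # keep pursuing the delegated task\n",
    ])
    for _ in range(3)
]

# The agent now 'turns itself off' by exiting...
print("agent exiting; minions still running with PIDs:", [m.pid for m in minions])
sys.exit(0)
# ...but the children are separate processes, so switching off the parent says
# nothing about them unless 'the agent' is defined to include everything it
# has created or hired -- which is exactly the identity problem described above.
```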
I discuss the associated problems here: