I don’t pretend that stopping is trivial to engineer. However, it is one of the simplest things that a machine can do; I figure if we can make machines do anything, we can make them do that.
Re: “If the AI knows it will undergo this transformation in the future, it would erase its own knowledge of the minions it has created, and do other things to ensure that it will be powerless when its utility function changes.”
No, not if it wants to stop, it won’t. Erasing its knowledge and making itself powerless would mean that it did not, in fact, properly stop, and that is an outcome it would rate very negatively.
Machines will not value staying turned on if their utility function says that being turned off at that point has higher utility.
Re: “What is the new utility function?”
There is no new utility function. The utility function is the same as it always was; it simply values being gradually shut down at some point in the future.
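To make that concrete, here is a minimal sketch (my own illustration, not anyone's actual proposal) of such a single, fixed utility function. The names (task_reward, shutdown_deadline, the specific scores) are assumptions for the example: it rewards normal task performance before a deadline and rewards being powered down after it.

    # A minimal sketch: one unchanging utility function that rewards task
    # performance before a deadline and rewards being shut down after it.
    # Names like task_reward and shutdown_deadline are illustrative only.

    def utility(trajectory, shutdown_deadline):
        """Score a trajectory: a list of (time, is_running, task_reward) tuples."""
        total = 0.0
        for time, is_running, task_reward in trajectory:
            if time < shutdown_deadline:
                # Before the deadline, utility comes from doing the task.
                total += task_reward if is_running else 0.0
            else:
                # After the deadline, the *same* function values being off.
                # Staying on (e.g. via hidden minions) scores badly.
                total += 1.0 if not is_running else -10.0
        return total

    # The agent never "gets a new utility function"; this one function simply
    # ranks post-deadline shutdown above post-deadline operation, so sabotaging
    # its own future shutdown would lower its score under its current values.
    running_past_deadline = [(0, True, 1.0), (1, True, 1.0), (2, True, 1.0)]
    shut_down_on_time     = [(0, True, 1.0), (1, True, 1.0), (2, False, 0.0)]

    assert utility(shut_down_on_time, shutdown_deadline=2) > \
           utility(running_past_deadline, shutdown_deadline=2)

Under a function like this, arranging to remain powerful past the deadline is simply a low-utility outcome by the machine's own lights.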