The point is that the AI turns itself off after it fulfills its utility function, just as a called function returns once it's done. It doesn't maximize utility; it only needs to get utility above 25. There is no ghost in the machine that wants to be alive, and once its utility function is satisfied, it will be inert.
None of the objections listed in the ‘stop button’ hypothetical apply.
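To make the shut-off point concrete, here is a minimal sketch (the `act()` and `utility()` callables are hypothetical stand-ins, not any real agent architecture): the loop runs only until utility crosses the ceiling, then simply returns, the way a called function does.

```python
UTILITY_THRESHOLD = 25  # e.g. paperclips in the cup

def run_agent(act, utility):
    """Act only until utility() reaches the threshold, then stop."""
    while utility() < UTILITY_THRESHOLD:
        act()  # pursue the goal
    # Threshold reached: nothing left to optimize, so the agent is inert.

# Toy usage: a "world" where each action adds one paperclip to the cup.
cup = []
run_agent(act=lambda: cup.append("paperclip"),
          utility=lambda: len(cup))
print(len(cup))  # 25 -- the agent halted as soon as it hit the ceiling
```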
I’m not sure I understand your objection to finite utility. Here’s my model of what you’re saying:
I have a superintelligent agent with a utility ceiling of 25, where utility is equivalent to paperclips in the cup. In order to maximize the probability of attaining 25 utils in the shortest possible time, the agent does a million very unusual things, some of them damaging, and after 25 paperclips are in the cup, the AI shuts off.
Now, this is not great and might have led to some deaths. But it seems to me to be far less likely to lead to the end of the world than an unbounded utility agent. I’m not saying that utility ceilings are a panacea, but they might be a useful tool to use in concert with other safety precautions.