Thought experiment: is there any utility function a paper-clip maximiser could switch to which would result in a universe containing more paper-clips?
Yes. Suppose the paperclip maximizer inhabits the same universe as a bobby-pin maximizer. The two agents interact in a cooperative game which has a (Nash) bargaining solution that provides more of both desirable artifacts than either player could achieve without cooperating. It is well known that cooperative play can be explained as a kind of utilitarianism—both players act so as to maximize a linear combination of their original utility functions. If the two agents have access to each other’s source code, and if the only way for them to enforce the bargain is to both self-modify so as to each maximize the new joint utility function, then they both gain by doing so.
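The bargaining step above can be sketched concretely. In this minimal toy model (all numbers hypothetical, not from the original comment), the two agents divide 10 units of a shared resource; without a bargain they fight and each salvages only 3 units' worth of its artifact (the disagreement point), while the Nash bargaining solution picks the split maximizing the product of gains over that disagreement point:

```python
# Toy Nash bargaining sketch: a paperclip maximizer and a bobby-pin
# maximizer divide UNITS of shared resource. The numbers are invented
# for illustration only.

UNITS = 10
DISAGREEMENT = (3, 3)  # (paperclips, bobby pins) each gets by fighting

def nash_bargain(units, disagreement):
    """Return the split maximizing the Nash product (u_A - d_A)(u_B - d_B)."""
    d_a, d_b = disagreement
    best, best_product = None, float("-inf")
    for clips in range(units + 1):
        pins = units - clips
        if clips < d_a or pins < d_b:
            continue  # neither agent accepts less than it gets by fighting
        product = (clips - d_a) * (pins - d_b)
        if product > best_product:
            best, best_product = (clips, pins), product
    return best

clips, pins = nash_bargain(UNITS, DISAGREEMENT)
print(clips, pins)  # 5 5 -- both agents beat their disagreement payoff of 3
```

The bargained outcome (5, 5) Pareto-dominates the disagreement point (3, 3), and since it lies on the Pareto frontier it can be reached by both agents maximizing a suitable linear combination of their utilities (here, equal weights by symmetry), which is the joint utility function the comment describes them self-modifying into.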
The problem is that if the universe changes, and/or their understanding of the universe changes, one or both agents may come to regret the modification: there may be a new bargain, better for one or both parties, that is no longer achievable after they have self-modified. So irrevocable self-modification may be a bad idea in the long term, but it can sometimes be a good idea in the short term.
An easier way to see this point is to simply notice that to make a promise is to (in some sense) self-modify your utility function. And, under certain circumstances, it is rational to make a promise with the intent of keeping it.