Peter: if your change of utility function is one of domain rather than degree, you can’t calculate the negative utility. The difference in utility between making 25 paperclips a day and 500 a day is a calculable difference for a paperclip-maximizing optimization process.
However, if the paperclip optimizer self-modifies and inadvertently changes its utility function to maximizing staples... well, you can’t calculate paperclips in terms of staples. This outcome is of infinite negative utility from the perspective of the paperclip maximizer. And vice versa: once the utility function has changed to maximizing staples, changing back to paperclips would be of infinite negative utility from the perspective of the staple-maximizing utility function.
This defeats the built-in time-out clause. With a modification that only affects your ability to reach your current utility, you have a measurable output. With a modification that changes your utility function, you are changing the very thing you were using to measure success.
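A rough sketch of the distinction, in case it helps. Everything here is made up for illustration (the `Outcome` type, the two utility functions, the staple count); it only shows that a change of degree can be measured with one yardstick, while a change of domain swaps the yardstick itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Hypothetical daily output of the optimizer."""
    paperclips: int
    staples: int


def paperclip_utility(o: Outcome) -> float:
    # Assumed utility function: only paperclips count.
    return float(o.paperclips)


def staple_utility(o: Outcome) -> float:
    # Assumed utility function: only staples count.
    return float(o.staples)


# Change of degree: same utility function, different capability.
slow = Outcome(paperclips=25, staples=0)
fast = Outcome(paperclips=500, staples=0)
degree_gain = paperclip_utility(fast) - paperclip_utility(slow)

# Change of domain: the agent now produces staples and scores itself
# with the new function. The number 500 is an arbitrary illustration.
after_swap = Outcome(paperclips=0, staples=500)
old_yardstick = paperclip_utility(after_swap)  # what the original agent would see
new_yardstick = staple_utility(after_swap)     # what the modified agent reports
```

In this toy setup the modified agent checks its progress with `staple_utility`, so any self-check it runs after the change reports success, even though the function it started with sees nothing of value.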
I know this isn’t worded very well. I’m sure one of Eliezer’s posts has covered this subject better at some point.