Are you looking for a utility function that depends only upon an external snapshot of the state of the universe? Or are you considering utility functions that evaluate history and internal states as well? This is almost never made clear in such questions, and amphiboly is rife in many discussions about utility functions.
My ideas aren’t really formalized, but I’m imagining that NormalUtilityFunction would be based on just the external state of the universe and that the full utility function with pausing would just add the arguably internal states of (paused) and (taking actions).
Ah, that does make it almost impossible then. Such a utility function must have a constant value across all outcomes when paused, or the agent will have an incentive to do something. Then in the non-paused state the otherwise reachable utility is either greater than that constant (in which case the agent has an incentive to prevent being paused) or less than or equal to it (in which case its best outcome is to make itself paused).
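To make the dilemma concrete, here is a rough toy sketch of it, not a formalization. The function name `pause_incentive`, the labels it returns, and the example utility values are all made up for illustration; the only thing taken from the argument above is the comparison between the constant paused utility and the best utility reachable while unpaused.

```python
def pause_incentive(reachable_utilities, paused_constant):
    """Classify a toy agent's incentive toward the paused state.

    reachable_utilities: hypothetical NormalUtilityFunction values the agent
        could still reach if it is not paused (illustrative numbers only).
    paused_constant: the single value the utility takes in every paused
        outcome (assumed constant, as the argument requires).
    """
    best_unpaused = max(reachable_utilities)
    if best_unpaused > paused_constant:
        # Being paused costs utility, so the agent gains by preventing it.
        return "resist pause"
    # Nothing reachable beats the paused value, so pausing itself is
    # among the agent's best outcomes.
    return "seek pause"


# Hypothetical utilities reachable in the non-paused state.
normal_utilities = [0.0, 1.5, 3.0]

for c in (1.0, 3.0, 5.0):
    print(c, pause_incentive(normal_utilities, c))
```

Whatever constant is chosen for the paused state, the sketch lands in one of the two branches, which is the point: under these assumptions there is no value that leaves the agent without an incentive either to avoid the pause or to bring it about (at exact equality it is merely indifferent).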