My ideas aren’t really formalized, but I’m imagining that NormalUtilityFunction would be based on just the external state of the universe and that the full utility function with pausing would just add the arguably internal states of (paused) and (taking actions).
Ah, that does make it almost impossible then. Such a utility function when paused must have constant value for all outcomes, or it will have incentive to do something. Then in the non-paused state the otherwise reachable utility is either greater than that (in which case it has incentive to prevent being paused) or less than or equal (in which case its best outcome it to make itself paused).
My ideas aren’t really formalized, but I’m imagining that NormalUtilityFunction would be based on just the external state of the universe and that the full utility function with pausing would just add the arguably internal states of (paused) and (taking actions).
Ah, that does make it almost impossible then. Such a utility function when paused must have constant value for all outcomes, or it will have incentive to do something. Then in the non-paused state the otherwise reachable utility is either greater than that (in which case it has incentive to prevent being paused) or less than or equal (in which case its best outcome it to make itself paused).