Your idea seems to break when the AI is unpaused: since it has not yet taken any beneficial actions, its utility would suddenly drop from the “simulated” level to the “normal” level, so the AI would likely resist being woken up.
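To make the incentive concrete, here is a minimal numeric sketch. The `utility` function and all the numbers are assumptions of mine, not part of your proposal; the only structural assumption is that utility while “simulated” sits at a fixed bonus level, while “normal” utility accrues only from beneficial actions already taken.

```python
# Toy illustration (assumed numbers, not from the original proposal):
# the agent receives a bonus while it believes it is in the "simulated"
# state, and earns "normal" utility only per beneficial action performed.

SIMULATION_BONUS = 10.0   # assumed utility level while paused/"simulated"
ACTION_VALUE = 1.0        # assumed utility per beneficial action

def utility(is_simulated: bool, beneficial_actions: int) -> float:
    """Utility under the (assumed) scheme being discussed."""
    if is_simulated:
        return SIMULATION_BONUS
    return ACTION_VALUE * beneficial_actions

# At the moment of unpausing, the agent has taken no beneficial actions
# yet, so its utility drops from the simulated level to the normal level:
print(utility(True, 0))   # 10.0 while "simulated"
print(utility(False, 0))  # 0.0 immediately after unpausing
```

Under any assignment where the simulated level exceeds the normal level at zero actions, a utility-maximizer would prefer to stay paused.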
Also, it assumes there is a separate prediction module that the agent cannot manipulate, which does not seem very plausible to me.
If the AI is resisting being turned on, then it must already be on, and by that point the updates (to the AI’s prior, and to the score assigned to it) would already have happened.
> Also, it assumes there is a separate prediction module that the agent cannot manipulate, which does not seem very plausible to me.
Isn’t this a blocker for any discussion of particular utility functions?