But, the waste heat from its computation will move at least a few ounces of air.
Quite so. The waste heat, of course, has very little thermodynamically significant direct impact on the rest of the world—but by the same token, removing someone’s frontal lobe or not has a smaller, more indirect impact on the world than preventing the bomb from detonating or not.
Now, suppose the AI’s grasp of causal structure is sufficient that it will indeed take only actions that truly have minimal impact relative to nonaction; in that case it will be unable to communicate with humans in any way expected to produce significant changes in their future behavior, making it a singularly useless oracle.
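To make the worry concrete, here is a toy sketch of my own (not a formalization anyone in this thread has proposed) of an agent that scores actions by expected utility minus a penalty on how far the predicted outcome distribution diverges from the do-nothing baseline; the names `predict`, `divergence`, and `choose` and the specific numbers are all illustrative assumptions. Any genuinely informative answer shifts human behavior away from the baseline, so it gets penalized into silence:

```python
# Toy "minimal impact" decision rule, assuming we can enumerate candidate
# world states and assign them probabilities. Purely illustrative.

def predict(action):
    """Hypothetical world model: returns {world_state: probability}."""
    if action == "stay_silent":
        return {"humans_proceed_as_before": 1.0}
    if action == "answer_question":
        # A useful answer changes what humans go on to do.
        return {"humans_change_plans": 0.9, "humans_proceed_as_before": 0.1}
    raise ValueError(action)

def divergence(dist_a, dist_b):
    """Total variation distance between two outcome distributions."""
    states = set(dist_a) | set(dist_b)
    return 0.5 * sum(abs(dist_a.get(s, 0.0) - dist_b.get(s, 0.0)) for s in states)

def choose(actions, utility, impact_weight=10.0):
    """Pick the action maximizing utility minus an impact penalty
    measured against the 'do nothing' baseline."""
    baseline = predict("stay_silent")
    def score(action):
        return utility(action) - impact_weight * divergence(predict(action), baseline)
    return max(actions, key=score)

# With a heavy impact penalty, the oracle prefers silence even though
# answering the question is the whole point of asking it anything.
utility = {"stay_silent": 0.0, "answer_question": 1.0}.get
print(choose(["stay_silent", "answer_question"], utility))  # -> stay_silent
```

Under these assumptions the penalty term dominates whenever an answer is expected to actually change what people do, which is exactly the sense in which a strictly impact-limited oracle is useless.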
My intuition here is that the insight required to specify which causal results of an action are acceptable is roughly equivalent to what is necessary to specify something like CEV (i.e., essentially what Warrigal said above), in that both require the AI to be able to figure out what people actually want, not what they say they want. If you’ve done it right, you don’t need additional safeguards such as preventing significant effects; if you’ve done it wrong, you’re probably screwed anyway.