“kill all humans, then shut down” is probably the action that most minimizes change. Leaving those buggers alive will cause more (and harder to predict) change than anything else the agent might do.
There’s no way to talk about this in the abstract sense of change—it has to be differential from a counterfactual (aka: causal), and can only be measured by other agents’ evaluation functions. The world changes for lots of reasons, and an agent might have most of it’s impact by PREVENTING a change, or by FAILING to change something that’s within it’s power. Asimov’s formulation included this understanding: A robot may not injure a human being or, through inaction, allow a human being to come to harm.
I agree it doesn’t make sense to talk about this kind of change as what we want impact measures to penalize, but i think you could talk about this abstract sense of change. You could have an agent with beliefs about the world state, and some distance function over world states, and then penalize change in observed world state compared to some counterfactual.
This kind of change isn’t the same thing as perceived impact, however.
“kill all humans, then shut down” is probably the action that most minimizes change. Leaving those buggers alive will cause more (and harder to predict) change than anything else the agent might do.
There’s no way to talk about this in the abstract sense of change—it has to be differential from a counterfactual (aka: causal), and can only be measured by other agents’ evaluation functions. The world changes for lots of reasons, and an agent might have most of it’s impact by PREVENTING a change, or by FAILING to change something that’s within it’s power. Asimov’s formulation included this understanding: A robot may not injure a human being or, through inaction, allow a human being to come to harm.
Yep, been dealing with that issue for some time now ^_^
https://arxiv.org/abs/1705.10720
I agree it doesn’t make sense to talk about this kind of change as what we want impact measures to penalize, but i think you could talk about this abstract sense of change. You could have an agent with beliefs about the world state, and some distance function over world states, and then penalize change in observed world state compared to some counterfactual.
This kind of change isn’t the same thing as perceived impact, however.