The problem with the first part of this sequence is that it can seem… obvious… until you realize that almost all prior writing about impact has not even acknowledged that we want the AI to leave us able to get what we want (to preserve our attainable utility).
Agreed. This has been my impression from reading previous work on impact.
Let me substantiate my claim a bit with a random sampling; I just pulled up a relative reachability blog post. From the first paragraph (emphasis mine):
An incorrect or incomplete specification of the objective can result in undesirable behavior like specification gaming or causing negative side effects. There are various ways to make the notion of a “side effect” more precise – I think of it as a disruption of the agent’s environment that is unnecessary for achieving its objective. For example, if a robot is carrying boxes and bumps into a vase in its path, breaking the vase is a side effect, because the robot could have easily gone around the vase. On the other hand, a cooking robot that’s making an omelette has to break some eggs, so breaking eggs is not a side effect.
But notice that now we’re talking about “disruption of the agent’s environment”. Relative reachability is indeed tackling the impact measure problem, so, using what we now understand, we might prefer to reframe this as:
We think about “side effects” when they change our attainable utilities, so they’re really just a conceptual discretization of “things which negatively affect us”. We want the robot to prefer policies which avoid overly changing our attainable utilities. For example, if a robot is carrying boxes and bumps into a vase in its path, breaking the vase is a side effect, because it’s not that easy for us to repair the vase...