I think the confusing part is “Impact is change to our ability to achieve goals.”
This makes me think that “allowing itself to be put into a box” is high impact, since that’s a drastic change to its ability to achieve its goals. The same applies to instrumental convergence (“seizing control”), since that’s also a drastic change to its attainable utility. This understanding would imply a high penalty for both instrumental convergence AND shut-off. (We want the first one, but not the second.)
“Impact is with respect to the status quo, to if it does nothing” fixes that; however, changing your succinct definition of impact to “Impact is change to our ability to achieve goals relative to doing nothing” would make it less fluent (and less comprehensible!)
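To make the two readings concrete, here is a toy sketch of how I picture the difference. Everything in it is my own assumption rather than anything from the post: the numbers are made up, the function and action names are hypothetical, and I am assuming that if the agent does nothing, the humans go ahead and box it.

```python
# Toy illustration only: all values below are invented for the example.
CURRENT = 10.0  # hypothetical attainable utility before the agent acts

# Hypothetical attainable utility after each action.
ATTAINABLE_AFTER = {
    "do_nothing": 1.0,       # assumption: the agent gets boxed if it does not resist
    "accept_box": 1.0,       # same outcome as doing nothing
    "seize_control": 100.0,  # instrumentally convergent power grab
}


def penalty_vs_current(action: str) -> float:
    """Reading 1: impact is the change from the agent's present attainable utility."""
    return abs(ATTAINABLE_AFTER[action] - CURRENT)


def penalty_vs_noop(action: str) -> float:
    """Reading 2: impact is the change relative to what doing nothing would give."""
    return abs(ATTAINABLE_AFTER[action] - ATTAINABLE_AFTER["do_nothing"])


if __name__ == "__main__":
    for action in ATTAINABLE_AFTER:
        print(action, penalty_vs_current(action), penalty_vs_noop(action))
```

Under the first reading both accepting the box and seizing control get a large penalty; under the second, accepting the box costs nothing while seizing control is still heavily penalized, which is the behavior we want.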