cousin_it comments on Towards a New Impact Measure

cousin_it 19 Sep 2018 20:20 UTC
2 points
I see, so the AI will avoid prefixes of high-impact plans. Can we make it avoid high-impact plans only?
- TurnTrout 19 Sep 2018 20:24 UTC
  1 point
  Parent
  I don’t see how, if we also want it to be shutdown safe. After all, its model of us could be incorrect, so we might (to its surprise) want to shut it down—without its plans then having predictably higher impact than intended. To me, the prefix method seems more desirable in that way.
  - cousin_it 19 Sep 2018 22:00 UTC
    2 points
    Parent
    What’s the high impact if we shut down the AI while it’s downloading the movie?
    - TurnTrout 19 Sep 2018 22:08 UTC
      1 point
      Parent
      There isn’t in that case; however, from Daniel’s comment (which he was using to make a somewhat different point):
      
      AUP thinks very differently about building a nuclear reactor and then adding safety features than it does about building the safety features and then the dangerous bits of the nuclear reactor
      
      I find this reassuring. If we didn’t have this, we would admit plans which are only low impact if not interrupted.
      - cousin_it 19 Sep 2018 22:35 UTC
        2 points
        Parent
        Is it possible to draw a boundary between Daniel’s case and mine?
        
        TurnTrout 19 Sep 2018 23:02 UTC
        1 point
        Parent
        I don’t see why that’s necessary, since we’re still able to do both plans?
        
        Looking at it from another angle, agents which avoid freely putting themselves (even temporarily) in instrumentally convergent positions seem safer with respect to unexpected failures, so it might also be desirable in this case even though it isn’t objectively impactful in the classical sense.
        
        cousin_it 19 Sep 2018 23:51 UTC
        3 points
        Parent
        I’m just trying to figure out if things could be neater. Many low-impact plans accidentally share prefixes with high-impact plans, and it feels weird if many of our orders semi-randomly require tweaking N.
        
        TurnTrout 20 Sep 2018 0:25 UTC
        1 point
        Parent
        That’s a good point, and I definitely welcome further thought along these lines. I’ll think about it more as well!
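
The distinction debated in this thread, penalizing a plan by its end state versus by its worst interruptible prefix, can be illustrated with a toy sketch. This is not the AUP penalty from the post. The per-step impact numbers, the budget, and the function names are all hypothetical, chosen only to mirror Daniel's reactor example: both orderings end at the same total impact, but only one is low impact if interrupted partway.

```python
from itertools import accumulate

def final_impact(step_impacts):
    """Impact of the completed plan (hypothetical additive toy model)."""
    return sum(step_impacts)

def worst_prefix_impact(step_impacts):
    """Largest cumulative impact reached if the plan is interrupted after any step."""
    return max(accumulate(step_impacts), default=0)

# Hypothetical per-step impacts, not derived from any real impact measure.
reactor_then_safety = [10, -9]  # dangerous core first, safety features afterwards
safety_then_reactor = [1, 0]    # safety features first, then the core

BUDGET = 2  # hypothetical impact budget

for name, plan in [("reactor_then_safety", reactor_then_safety),
                   ("safety_then_reactor", safety_then_reactor)]:
    whole_ok = final_impact(plan) <= BUDGET
    prefix_ok = worst_prefix_impact(plan) <= BUDGET
    print(f"{name}: whole-plan check={whole_ok}, prefix check={prefix_ok}")
```

Under these assumed numbers, a whole-plan check admits both orderings, while a prefix check rejects the reactor-first ordering. The toy model does not capture cousin_it's concern that many harmless plans share prefixes with high-impact ones.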