Rohin Shah comments on Wireheading as a potential problem with the new impact measure

Rohin Shah 4 Oct 2018 17:00 UTC
LW: 3 AF: 2
But the first action doesn’t strictly improve your ability to get u_A (because you could just wait and execute the plan later), and so intent verification would give it a 1.01 penalty?
- TurnTrout 4 Oct 2018 19:18 UTC
  LW: 3 AF: 2
  That doesn’t conflict with what I said.
  
  It’s also fine in worlds where these properties really are true. If the agent believes they hold when they don’t, it’ll start acting once it realizes its mistake. Seems like a nonissue.
  - Rohin Shah 5 Oct 2018 18:35 UTC
    LW: 3 AF: 2
    > Seems like a nonissue.
    I’m not claiming it’s an issue; I’m trying to understand what AUP does. Your responses to comments are frequently of the form “AUP wouldn’t do that”, so afaict none of the commenters (including me) groks your conception of AUP. I’m trying to extract simple implications and check whether they’re actually true, in an attempt to grok it.
    > That doesn’t conflict with what I said.
    I can’t tell whether you agree or disagree with my original claim. “Don’t think so in general?” implies you disagree, but this reply implies you agree?
    If you disagree with my original claim, what’s an example with deterministic known dynamics, where there is an optimal plan to achieve maximal u_A that can be executed at any time, where AUP with intent verification will execute that plan before the last possible moment in the epoch?
    - TurnTrout 5 Oct 2018 19:04 UTC
      LW: 4 AF: 3
      I agree with what you said for those environments, yeah. I was trying to express that I don’t expect this situation to be common, which is beside the point in light of your motivation for asking!
      
      (I welcome these questions and hope my short replies don’t come off as impatient. I’m still dictating everything.)
      - Rohin Shah 9 Oct 2018 6:41 UTC
        LW: 1 AF: 1
        Cool, thanks!
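The claim the thread converges on (with deterministic known dynamics and a plan that can be executed at any time, intent verification penalizes starting early) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the environment, the names `q_u_a`, `intent_verification_penalty`, and `unpenalized_step_times`, and the simplified rule "a non-no-op action gets a 1.01 penalty unless it strictly improves attainable u_A over the no-op". It leaves out the AUP impact penalty itself and is not the full definition from the original post.

```python
NOOP, STEP = "noop", "step"
IV_PENALTY = 1.01  # penalty value quoted in the thread


def attainable_u_a(progress, steps_left, plan_len):
    """Toy u_A: 1.0 if the rest of the plan still fits in the epoch, else 0.0."""
    return 1.0 if plan_len - progress <= steps_left else 0.0


def q_u_a(t, progress, action, epoch_len, plan_len):
    """Best attainable u_A after taking `action` at time t (deterministic toy model)."""
    new_progress = min(progress + (1 if action == STEP else 0), plan_len)
    steps_left = epoch_len - (t + 1)
    return attainable_u_a(new_progress, steps_left, plan_len)


def intent_verification_penalty(t, progress, action, epoch_len, plan_len):
    """Assumed rule: penalize a non-no-op action that does not strictly beat the no-op."""
    if action == NOOP:
        return 0.0
    q_act = q_u_a(t, progress, action, epoch_len, plan_len)
    q_noop = q_u_a(t, progress, NOOP, epoch_len, plan_len)
    return IV_PENALTY if q_act <= q_noop else 0.0


def unpenalized_step_times(epoch_len, plan_len):
    """Roll forward, advancing the plan only when intent verification allows it."""
    progress, times = 0, []
    for t in range(epoch_len):
        if progress < plan_len and intent_verification_penalty(
            t, progress, STEP, epoch_len, plan_len
        ) == 0.0:
            progress += 1
            times.append(t)
    return times


if __name__ == "__main__":
    # A 3-step plan in a 10-step epoch.
    print(unpenalized_step_times(epoch_len=10, plan_len=3))
```

In this toy model, waiting never lowers attainable u_A until only `plan_len` steps remain, so every earlier plan step ties with the no-op and is penalized. The plan is only started at the last moment at which it still fits in the epoch.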