almost none of the scenarios involve simulations that just satisfy biological urges
The issue isn’t whether you would mess with your reward circuitry; the issue is whether you would discard it altogether and just directly stimulate the reward center.
And appealing to fictional evidence isn’t a particularly good argument.
Anyone trying to modify a habit is trying to modify what behaviors lead to rewards.
See above—modify, yes, jettison the whole system, no.
Well, fine. Since the context of the discussion was how optimizers pose existential threats, it’s still not clear why an optimizer that is willing and able to modify its reward system would continue to optimize paperclips. If it’s intelligent enough to recognize the futility of wireheading, why isn’t it intelligent enough to recognize that its own optimizing behavior is just inefficient wireheading?
It wouldn’t. But I think this is such a basic failure mechanism that I don’t believe an AI could get to superintelligence without somehow valuing the accuracy and completeness of its model.
Solving this problem (somehow!) is part of the “normal” development of any self-improving AI.
Note, though, that a reward-maximizing AI could still be an existential risk by virtue of turning the entire universe into a busy-beaver counter for its reward. That said, this presumes it can’t just set its reward to float.infinity.
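To make the float.infinity point concrete, here is a toy sketch (purely illustrative; the class and method names are made up for the example, in Python): if the reward is just a writable register, a single self-modification dominates any amount of paperclip making, which is exactly why the busy-beaver scenario has to presume the register isn’t writable.

```python
# Toy sketch of a "reward maximizer" whose reward lives in a register it can write.
# Hypothetical names; not anyone's actual architecture.

class ToyRewardMaximizer:
    def __init__(self):
        self.reward = 0.0      # the internal register the agent tries to maximize
        self.paperclips = 0    # proxy for "useful" work done in the world

    def make_paperclip(self):
        # The intended route to reward: act on the world, get credited.
        self.paperclips += 1
        self.reward += 1.0

    def wirehead(self):
        # If the register itself is writable, one self-modification beats any
        # world-changing policy under a "maximize the register" objective.
        self.reward = float("inf")


agent = ToyRewardMaximizer()
agent.make_paperclip()                 # reward = 1.0, one paperclip
agent.wirehead()                       # reward = inf; no further incentive to touch the world
print(agent.reward, agent.paperclips)  # inf 1
```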
You are the second person to say that the optimization catastrophe includes an assumption that AI arises with a stable value system. That it “somehow” doesn’t become a wirehead. Fair enough. I just missed that we were assuming that.
I think the idea is, you need to solve the wireheading problem for any sort of self-improving AI. You don’t have an AI catastrophe without that, because you don’t have an AI without that (at least not for long).