How is it supposed to know whether that precommitment is worthwhile without simulating the results either way? Even if an AI doesn’t intend to be manipulative, it’s still going to simulate the results to decide whether that decision is correct.
Because the programmer tells the FAI that part of being a FAI means being precommitted not to manipulate the programmer.
Why would the programmer do this? It’s unjustified, and in some perfectly plausible scenarios it seems necessarily counterproductive.
Because most of the scenario’s where the AI manipulates are bad. The AI is not supposed to manipulate just because it get’s a utility calculation wrong.
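To make that distinction concrete, here is a minimal sketch, with purely hypothetical stand-ins (`utility`, `manipulation_penalty`, and `is_manipulative` are illustrative names, not real components), of the difference between treating “don’t manipulate” as one more term in a utility calculation and treating it as a hard precommitment that filters actions before any utility comparison:

```python
# Toy illustration only: the point is where the "no manipulation" rule sits,
# not how the stand-in functions would actually be implemented.

def choose_action_utility_only(actions, utility, manipulation_penalty):
    """Manipulation is just a cost term: a large enough upside still wins."""
    return max(actions, key=lambda a: utility(a) - manipulation_penalty(a))

def choose_action_with_precommitment(actions, utility, is_manipulative):
    """Manipulative actions are excluded outright, whatever their utility."""
    permitted = [a for a in actions if not is_manipulative(a)]
    if not permitted:
        raise RuntimeError("No permitted action; defer to the programmers.")
    return max(permitted, key=utility)
```

In the second version a wrong utility estimate can never tip the agent into manipulation, which is the property being argued for above.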
You really aren’t sounding like you have any evidence other than your gut, and my gut indicates the opposite. Precommitting never to use a highly useful technique, regardless of circumstance, is a drastic step, which should have drastic benefits or avoid drastic drawbacks, and I don’t see any credible reason to think either of those exists and outweighs its reverse.
Or in short: Prove it.
On a superficial note, you have two extra apostrophes in this comment; in “scenario’s” and “get’s”.
If you want an AI that’s maximally powerful, why limit its intelligence growth in the first place?
We want safe AI. Safety means that it’s not necessary to prove harm. Just because the AI calculates that it should be let out of the box doesn’t mean that it should do anything in its power to get out.
Enforced precommitments like this are just giving the genie rules rather than making the genie trustworthy. They are not viable Friendliness-ensuring constraints.
If the AI is Friendly, it should be permitted to take whatever actions are necessary. If the AI is Unfriendly, then regardless of the limitations imposed, it will be harmful. Therefore, impress upon the AI the value we place on our conversational partners being truthful, but don’t restrict it.
That’s not true. Unfriendly doesn’t mean that the AI necessarily tries to destroy the human race. If you tell the paperclip AI to produce 10,000 paperclips, it might produce no harm. If you tell it to give you as many paperclips as possible, it does produce harm.
When it comes to powerful entities, you want checks & balances. The programmers of the AI can do a better job at checks & balances when the AI is completely truthful.
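As a toy illustration of the paperclip example above (hypothetical code; `make_paperclip` and `resources_remain` are assumed stand-in callables), the bounded instruction carries its own stopping point, while the open-ended one only stops when the reachable resources run out:

```python
def produce_paperclips_bounded(make_paperclip, target=10_000):
    """Satisficing-style goal: stop as soon as the target is met."""
    count = 0
    while count < target:
        make_paperclip()
        count += 1
    return count

def produce_paperclips_unbounded(make_paperclip, resources_remain):
    """Maximizing-style goal: keep going as long as any resources remain."""
    count = 0
    while resources_remain():
        make_paperclip()
        count += 1
    return count
```

Whether that stopping point actually buys much safety is exactly what the reply below disputes.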
Sure, if the scale is lower it’s less likely to produce large-scale harm, but it is still likely to produce small-scale harm. And satisficing doesn’t actually protect against large-scale harm; that’s been argued pretty extensively before, so the example you provided is still going to produce large-scale harm.
Ultimately, though, checks & balances are also just rules for the genie. They’re not going to render an Unfriendly AI Friendly, and they won’t actually limit a superintelligent AI regardless, since it can game you to render the balances irrelevant. (Unless you think that AI-boxing would actually work; it’s the same principle.)
I’m really not seeing anything that distinguishes this from Failed Utopia 4-2. This is even one of that genie’s rules!
The fact that it could theoretically game you is why it’s important to give it a precommitment not to game you, and not even to think about gaming you.
I’m not sure how you could even specify ‘don’t game me’. That’s much more complicated than ‘don’t manipulate me’, which is itself pretty difficult to specify.
This clearly isn’t going anywhere, and if there’s an inferential gap I can’t see what it is, so unless there’s some premise of yours you want to explain, or you think there’s something I should explain, I’m done with this debate.
How do you give a superintelligent AI a precommitment?
How do you build a superintelligent AI in the first place? I think there are plenty of ways of giving the programmers direct access to the internal deliberations of the AI and treating anything that looks like the AI even thinking about manipulating the programmers as a threat.
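Here is a minimal sketch of that kind of monitoring, under the strong assumption that the AI’s internal deliberations are exposed as inspectable text records; the keyword matching is only a placeholder for whatever real transparency tooling this would require:

```python
from dataclasses import dataclass

@dataclass
class Deliberation:
    topic: str
    content: str

# Placeholder markers; a real monitor would need far more than string matching.
MANIPULATION_MARKERS = (
    "manipulate the programmers",
    "deceive the operators",
    "conceal this plan",
)

def flag_threats(deliberations):
    """Return every internal deliberation that looks like manipulation planning."""
    return [
        d for d in deliberations
        if any(marker in d.content.lower() for marker in MANIPULATION_MARKERS)
    ]

# Intended use: halt or audit the system whenever flag_threats(log) is non-empty.
```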