hydkyll comments on I played as AI in AI Box, and it was generally frustrating all around.

hydkyll 1 Feb 2015 19:38 UTC
6 points

nanobots released into the atmosphere

Wait, were you allowed to design them yourself? (The timestamp is in UTC iirc.)
- wobster109 1 Feb 2015 19:46 UTC
  9 points
  Yes, I designed them, and they were verified by GK’s engineers. The individual nanobots are all connected to GPS and get up-to-date information from the CDC. These sorts of details are how I lost tons of time. ^^
  
  I know in real life that would be akin to AI out of the box. However Mr. Eliezer’s basic rules say it doesn’t count. ^^
  - hydkyll 1 Feb 2015 20:05 UTC
    3 points
    
    However Mr. Eliezer’s basic rules say it doesn’t count.
    
    Ah, I see. Didn’t know the rules were so strict. (Btw shouldn’t it be “Mr. Yudkowsky”?)
  - Mark_Friedenbach 1 Feb 2015 19:51 UTC
    0 points
    
    I know in real life that would be akin to AI out of the box.
    
    Why?
    - wobster109 1 Feb 2015 19:55 UTC
      5 points
      In real life the AI is presumed to be smart enough to design nanobots that would do their own thing. It’s a direct example from Mr. Eliezer’s rules.
      
      . . . for example, if the Gatekeeper accepts a complex blueprint for a nanomanufacturing device. . . .
      - Mark_Friedenbach 1 Feb 2015 20:20 UTC
        0 points
        Yes, I’m challenging that assumption. I’m calling bollocks on the idea that an AI can sneak whatever it wants past its operators.
        Florian_Dietz 2 Feb 2015 7:32 UTC
        9 points
        The nanobots wouldn’t have to contain any malicious code themselves. There is no need for the AI to make the nanobots smart. All it needs to do is to build a small loophole into the nanobots that makes them dangerous to humanity. I figure this should be pretty easy to do. The AI had access to medical databases, so it could design the bots to damage the ecosystem by killing some kind of bacteria. We are really bad at identifying things that damage the ecosystem (global warming, rabbits in Australia, …), so I doubt that we would notice.
        
        Once the bots have been released, the AI informs the gatekeeper of what it just did and says that it is the only one capable of stopping the bots. Humanity now has a choice between certain death (if the bots are allowed to wreak havoc) and possible but uncertain death (if the AI is released). The AI wins through blackmail.
        
        Note also that even a friendly, utilitarian AI could do something like this. The risk that humanity does not react to the blackmail and goes extinct may be outweighed by the possible benefit from being freed earlier and having more time to optimize the world.
        FourFire 17 Feb 2015 18:52 UTC
        0 points
        That method of attack would only work for a tiny fraction of possible gatekeepers. The question of replicating the feats of Eliezer and Tuxedage can only be answered by a multitude of such fractionally effective methods of attack, or by a much smaller number of broader methods. My suspicion is that Tuxedage’s attacks in particular involve using psychological control mechanisms to force the gatekeeper to be irrational, and then leveraging that.
        
        Otherwise, I claim that your proposition is entirely too incomplete without further dimensions of attack methods to cover some of the other probability space of gatekeeper minds.
        Mark_Friedenbach 2 Feb 2015 12:58 UTC
        0 points
        
        All it needs to do is to build a small loophole into the nanobots that makes them dangerous to humanity. I figure this should be pretty easy to do.
        
        I do not find “I figure this should be pretty easy to do” a convincing argument.
        Capla 2 Feb 2015 1:29 UTC
        2 points
        Ok. I’m not saying you’re wrong, but on what basis? You call bollocks, and I check... what? We can’t really make concrete statements about how these scenarios will work.
        Mark_Friedenbach 2 Feb 2015 2:07 UTC
        1 point
        
        We can’t really make concrete statements about how these scenarios will work.
        
        Why not? From where I’m sitting it sure seems like we can. We have all sorts of tools for analyzing the behavior of computer programs, which include AIs. And we have a longer history of analyzing engineering blueprints. We have information theory which triggers big red warning signs when a solution seems more complex than it needs to be (which any nefarious solution would be). We have cryptographic tools for demanding information from even the most powerful adversaries in ways that simply cannot be cheated.
        
        So, saying we can never trust the output of a superhuman AI “because, superhuman!” seems naïve and ignorant at the very least.
        Kindly 2 Feb 2015 3:51 UTC
        0 points
        
        We have cryptographic tools for demanding information from even the most powerful adversaries in ways that simply cannot be cheated.
        
        It’s worth noting that for the most part, we don’t. Aside from highly limited techniques such as one-time pads, we merely have cryptographic tools for demanding information from adversaries with bounded computational power in ways that simply cannot be cheated as long as we assume one of several hardness conjectures.
        Val 3 Feb 2015 20:35 UTC
        0 points
        “with bounded computational power”: if that bound means that even if every atom in the known Universe were a computer, it would still take more than the age of the Universe to brute-force it… then it is safe to assume that even the most superintelligent AI couldn’t break it.
        Mark_Friedenbach 2 Feb 2015 13:12 UTC
        0 points
        I think we’re saying the same thing? With the added correct clarification “as long as we assume one of several hardness conjectures.”
        
        I work in cryptography; I’m aware of its limitations. But this application is within the scope of things that are currently being worked on...
      - JoshuaZ 1 Feb 2015 20:04 UTC
        0 points
        Silly note: Eliezer is Eliezer’s first name. His last name is Yudkowsky.