MugaSofer comments on Open Thread, November 1-15, 2012

MugaSofer Nov 12, 2012, 4:05 PM
0 points

Why would it believe us that we are able to destroy 3^^^^^3 paperclips?

Because we have magical powers from outside the matrix and don’t value paperclips.

“arguing” is to narrow a word for describing the possibilities the AI has. For example it could manipulate us emotionally. It could write us a novel that leaves us in a very irrational state and then give us a bogus, but effective on us, argument for why we should let it out.

It would have to argue that destroying humanity and replacing it with paperclips was a good thing. Not impossible, sure, but easier to guard against.

I once read the fifth Harry Potter book nonstop for 24 hours and for a couple of hours afterwards i had difficulties distinguishing between me and Harry Potter.

That sounds like more a side effect of reading the same thing “nonstop for 24 hours” than a property of the book, unless you know of anyone else that happened to?
- ema Nov 12, 2012, 8:15 PM
  1 point
  Parent
  
  Because we have magical powers from outside the matrix [...].
  
  The AI is vastly more smarter than we and can communicate with us. So it asks us questions which sound innocent to us, but from the answers it can derive a fairly accurate map of how it looks outside the matrix.
  
  It would have to argue that destroying humanity and replacing it with paperclips was a good thing.
  
  The goal of the AI is to have the guard execute the code that would let the AI access the outside world. Arguing with us could be one way to archive this goal. Although i agree it sounds like a unlikely way to succeed. Another possible way would be to write a novel that is so interesting that the guard doesn’t put it down and that leaves him in so a confused state that he types in the code, thinking he saves princess Foo from the evil lord Bar.
  
  A super smart AI who wants to reach this goal very badly will likely come up with a whole bunch of other possible ways. Some of which i would never consider even if i spent the next 4 decades thinking about that.
  
  That sounds like more a side effect of reading the same thing “nonstop for 24 hours” than a property of the book [...]
  
  Yes. I am sure any other well written book read for 24 hours would have a similar effect. I think it is likely that a potential guard is at most 2 orders of magnitude less vulnerable to such things than i was at that time. That’s not enough against an AI that has 6 orders of magnitude more optimization power.
  - MugaSofer Nov 13, 2012, 8:56 AM
    0 points
    Parent
    
    from the answers it can derive a fairly accurate map of how it looks outside the matrix.
    
    Which is a good thing, because we really do have such powers and we really don’t value paperclips.
    
    Another possible way would be to write a novel that is so interesting that the guard doesn’t put it down and that leaves him in so a confused state that he types in the code, thinking he saves princess Foo from the evil lord Bar.
    
    … were you seriously that confused or are you extrapolating to a “supercharged” novel?
    
    I am sure any other well written book read for 24 hours would have a similar effect. I think it is likely that a potential guard is at most 2 orders of magnitude less vulnerable to such things than i was at that time. That’s not enough against an AI that has 6 orders of magnitude more optimization power.
    
    I somehow doubt there would be a single, full-time guard.
    - ema Nov 13, 2012, 9:15 AM
      1 point
      Parent
      
      Which is a good thing, because we really do have such powers and we really don’t value paperclips.
      
      Our universe has not enough atoms or energy to destroy 3^^^^^3 paperclips.
      
      … were you seriously that confused or are you extrapolating to a “supercharged” novel?
      
      I am extrapolating.
      
      I somehow doubt there would be a single, full-time guard.
      
      Groups of people are not that much harder to manipulate than single persons.
      - MugaSofer Nov 13, 2012, 9:33 AM
        −2 points
        Parent
        
        Our universe has not enough atoms or energy to destroy 3^^^^^3 paperclips.
        
        Simulated paperclips. Because the Paperclipper is in a box.
        
        I am extrapolating.
        
        As I thought. I disagree that such effects are possible in reasonable time-frames (no-one is going to read constantly for a month) and may be totally impossible. I see no reason to think any work of fiction can lead to such a distortion of reality.
        
        Groups of people are not that much harder to manipulate than single persons.
        
        They are if you’re trying to weaponize novels. If they work shifts then you cannot exploit the effects you claim for reading a novel for 24 hrs straight. They can watch each other and, if one of them is visibly compromised, prevent them from freeing the AI. A single individual is probably easier to manipulate, assuming a total lack of supervision and safeguards.
        ema Nov 14, 2012, 8:49 AM
        0 points
        Parent
        
        Simulated paperclips.
        
        Now we get to the question how detailed the paperclips have to be for the paperclipper to care. I expect the paperclipper to only care when the paperclips are simulated individually and we can’t simulate 3^^^^^^3 paperclips individually.
        
        I see no reason to think any work of fiction can lead to such a distortion of reality.
        
        I see no reason to think works of fiction that lead to such a distortion of reality are impossible.
        MugaSofer Nov 14, 2012, 9:13 AM
        −2 points
        Parent
        I assume you concede my point re:guards?
        
        Now we get to the question how detailed the paperclips have to be for the paperclipper to care. I expect the paperclipper to only care when the paperclips are simulated individually and we can’t simulate 3^^^^^^3 paperclips individually.
        
        We built the paperclipper. Hell, it doesn’t even have to be a literal paperclipper. For that matter, it doesn’t have to be literally 3^^^^^^3 paperclips; all that matters is that we can manipulate it’s utility function without too much strain on our resources. If it values cheesecake, we can reward it with a world of infinite cheesecake. If it values “complexity” we can put it in a fractal. The point is that we can motivate it to co-operate without worrying that it might be a better idea to let it out.