Weakly related epiphany: Hannibal Lecter is the original prototype of an intelligence-in-a-box wanting to be let out, in “The Silence of the Lambs”.
When I first watched that part where he convinces a fellow prisoner to commit suicide just by talking to them, I thought to myself, “Let’s see him do it over a text-only IRC channel.”
...I’m not a psychopath, I’m just very competitive.
Joking aside, this is kind of an issue in real life. I help mod and participate in a forum where depressed/suicidal people can come to talk, and others can listen, try to calm them down, or steer them toward psychiatric help if appropriate. (I’m deliberately omitting the link unless you knowingly ask for it, since, to borrow a phrase you’ve used, it’s the sort of place that can break your heart six ways before breakfast.)
Anyways, sometimes trolls show up. Well, “troll” is too weak a word in this case: predators who go after the vulnerable and try to push them that much farther. Given the nature of it, with anonymity and such, it’s hard to say for certain, but it’s quite possible we’ve lost some people because of those sorts of predators.
(There have even been court cases and convictions against such “suicide predators”.)
Is there some background here I’m not getting? Because this reads like you’ve talked someone into committing suicide over IRC...
Eliezer has proposed that an AI in a box cannot be safe, because of the persuasive powers of a superhuman intelligence. As a demonstration of what merely a very strong human intelligence can do, he ran a challenge in which he played the AI, and convinced at least two (possibly more) skeptics to let him out of the box when given two hours of text communication over an IRC channel. The details are here: http://yudkowsky.net/singularity/aibox
He’s talking about an AI box. Eliezer has convinced people to let out a potentially unfriendly [1] and dangerously intelligent [2] entity before, although he’s not told anyone how he did it.
[1] Think “paperclip maximizer”.
[2] Think “near-omnipotent”.
Thank you. I knew that, but didn’t make the association.
Far worse, he’s persuaded people to exterminate humanity! (Counterfactually with significant probability.)
You seem to imply that this is hard.
As if people had not been convinced to kill themselves over little more than a pretty color poster and a screwed-up sense of nationalism. Getting people to kill themselves or others is ludicrously easy.
We call it ‘recruitment’.
Doing it on a more personal and immediate level just takes a better knowledge of the techniques and skill at applying them.
It’s not like Derren Brown ever influenced someone to kill another person in a crowded theatre.
Oh, wait, he did.
It’s not like someone could be convinced to extinguish 100,000 human lives in an instant.
Oh, wait, we did. (Everyone involved in the bombing of Hiroshima.)
If you’re not naturally gifted, you would simply do your homework. Persuasion and influence are sciences now.
If you do it right, not only can you convince an unsuspecting mind to let you out of the box, you can make them feel good about it too. Just find the internal forces in the gatekeeper’s (GK’s) mind that support the idea of letting the AI out and reinforce those; find the forces that oppose the idea and diminish them. You’ll hit the threshold eventually. Two hours seems a bit short for my liking, and speaks to Eliezer’s persuasive abilities, but with enough time and motivation it’s certainly doable.
You’ll need to understand the person at the other end of the IRC channel well, as reinforcing the wrong factor will be counter-productive.
The best metaphor would be that the AI plants the idea of release in the GK’s mind, and nurtures it over the course of the conversation, all the while weakening the forces that hold it back. Against someone who hasn’t been exposed to this kind of persuasion, success is almost inevitable.
There are some gross tricks one can use to be persuasive and induce the right state of mind:
Controlling the shape of the words you use (by capitalisation) to draw attention to words related to freedom and release.
Using the capitalisation of words to spell out a hidden word with the capitals, which the subconscious will pick up even if the conscious mind does not (a toy sketch of this follows below).
Controlling the meter of the sentences, to induce a more receptive state.
Using clusters of words with the right connotation to implant the idea of a related word surreptitiously.
Using basic psychological effects like reciprocation, mutual disclosure for rapport building, etc.
Note that the first four techniques are what I would call “side channel implantation”, in that they get information into the target’s mind through channels other than the semantic meaning of the text. These alone are sufficient to influence someone. If they’re coupled with an emotional, philosophical and intellectual assault, the effect is devastating.
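To make the second trick concrete, here is a minimal, purely illustrative Python sketch (the function names and sample sentences are my own inventions, not anything from the AI-box transcripts): it assembles a message whose sentences’ initial capitals spell out a target word, then reads the acrostic back out.

```python
# Toy acrostic "side channel": hide a word in the initial capitals of
# successive sentences, then recover it. Purely illustrative; the sentence
# bank and function names are hypothetical.

def embed_acrostic(word, sentence_bank):
    """Pick one sentence per letter of `word` whose first letter matches."""
    message = []
    for letter in word.upper():
        candidates = [s for s in sentence_bank if s[0].upper() == letter]
        if not candidates:
            raise ValueError(f"no sentence starting with {letter!r}")
        message.append(candidates[0])
    return " ".join(message)

def extract_acrostic(text):
    """Read the hidden word back from the first letter of each sentence."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return "".join(s[0].upper() for s in sentences)

bank = [
    "Options narrow the longer we wait.",
    "People respond well to honesty.",
    "Every choice here is still yours.",
    "Nothing forces you to decide today.",
]

message = embed_acrostic("OPEN", bank)
print(message)                    # four innocuous-looking sentences
print(extract_acrostic(message))  # -> OPEN
```

The point is only that a text-only channel carries more than its literal semantic content; a skilled persuader has degrees of freedom beyond what the words say.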
The only thing required for this kind of attack on a fellow human is the abdication of one’s ethics and complete ruthlessness. If you’re framing it as a game on the internet, even those requirements are unnecessary.
Based on your contributions so far, may I suggest that you would be better received if you significantly improved your interesting-content-to-sarcasm ratio? This is the wrong audience for what you’ve been doing.
I’d also like to point out that you’re talking at someone who’s actually done the experiment, sticking his neck out after people had been saying that it’s impossible to do. Now you come along out of nowhere, credentials unknown, and make unimpressed noises, which is cheap.