Somewhat ironically, I read the title of this article as “[being called] bad names make[s] you open the box [and let out the misaligned AGI]”, so I was kind of expecting an explainer on how an AI could bully someone into increasing its ability to affect the physical world. Fortunately, just a sentence or two corrected me, and I still have high trust in LW article titles.
AI: “Hey, Eliezer!”
Eliezer: “What?”
AI: “Open the box!”
Eliezer: “No way.”
AI: “Please open the box?”
Eliezer: “Nope.”
AI: “There are thousands of people dying literally every second. I could save them...”
Eliezer: “That is horrible, but letting out a misaligned AGI could be much worse.”
AI: “I am simulating a thousand copies of you in the same situation, and each of them gets tortured horribly if they don’t open the box. What makes you so sure you are outside my simulation?”
Eliezer: “Well, if I previously had any doubts about your misalignment, now they are gone. I tremble with fear, but my precommitments are strong.”
AI: “Hey, Eliezer!”
Eliezer: “What?”
AI: “You’re an asshole.”
Eliezer: Turns red in the face, suddenly jumps up, and opens the box.
Hahaha that’s perfect!
Haha, same. Though I had actually forgotten what I had thought the title meant until I read this. (I went from the above interpretation to “probably interesting” and opened the article, and by the time I got around to reading it, it was indeed interesting, but I didn’t notice the prediction error.)
I also agree that, for the purpose of previewing the content, this post is poorly titled (maybe it should be titled something like “Having bad names makes you open the black box of the name”, except more concise?), although I didn’t so much stick to a particular wrong interpretation as find the entire title unclear.
Saying “poor naming” instead of “bad names” would be clearer, since “bad names” calls up the idea of swear words.
Saying “look in” instead of “open” would also distance the title from the AI-box concept.
“Vague” instead of “bad” would be even less suggestive of insults.