The OP didn’t equate humans with bacteria, but offered an outside view that we humans, being on the inside, tend not to notice. Of course we are somewhat more complex than E. coli. We know that and we can see that easily. The blind spots lie where we are no different, and blind spots are precisely what gets in the way of MIRI’s recent buzzword, deconfusion.
Further, to nitpick your points:
“We will know that an AI system has been created”—why are you so sure? How would you recognize an AI that someone else created without telling you? Maybe we would interpret it through a prism similar to the one through which an E. coli interprets its environment (“Is it food? Is it a danger?”), unable to fathom anything more complex than that.
“We will have designed the AI system ourselves”—that is indeed the plan, and, arguably, in a fixed-rules environment we already have: AlphaZero. So, if someone formalizes the rules of human interaction well enough, odds are something like AlphaZero would be able to self-train to be more human than any human in a short time.
“We can answer questions posed to us by the AI system”—Yes, but our answers are not a reliable source of truth; they are mostly post hoc rationalizations. It has been posted here before (I can’t seem to find the link) that answers to “why” questions are much less reliable than answers to “what” questions.
“We will know that an AI system has been created”—why are you so sure? How would you recognize an AI that someone else created without telling you? Maybe we would interpret it through a prism similar to the one through which an E. coli interprets its environment (“Is it food? Is it a danger?”), unable to fathom anything more complex than that.
I feel like we are kind of in this position relative to The Economy.
The OP didn’t equate humans with bacteria, but offered an outside view that we humans, being on the inside, tend not to notice. Of course we are somewhat more complex than E. coli. We know that and we can see that easily.
The title of this post is “The E-Coli Test for AI Alignment”. The first paragraph suggests that any good method for AI alignment should also work on E-Coli. That is the claim I am disputing. Do you agree or disagree with that claim?
Perhaps the post was meant in the weaker sense that you mention, which I mostly agree with, but that’s not the impression I get from reading the post.
“We will know that an AI system has been created”—why are you so sure? How would you recognize an AI that someone else created without telling you? Maybe we would interpret it through a prism similar to the one through which an E. coli interprets its environment (“Is it food? Is it a danger?”), unable to fathom anything more complex than that.
I am super confused about what you are thinking here. At some point a human is going to enter a command or press a button that causes code to start running. That human is going to know that an AI system has been created. (I’m not arguing that all humans will know that an AI system has been created, though we could probably arrange for most humans to know this if we wanted.)
that is indeed the plan, and, arguably, in a fixed-rules environment we already have: AlphaZero. So, if someone formalizes the rules of human interaction well enough, odds are something like AlphaZero would be able to self-train to be more human than any human in a short time.
I don’t see how this is a nitpick of my point.
Yes, but our answers are not a reliable source of truth; they are mostly post hoc rationalizations. It has been posted here before (I can’t seem to find the link) that answers to “why” questions are much less reliable than answers to “what” questions.
Sure. They nonetheless contain useful information, in a way that E. coli may not. See for example Inverse Reward Design.
Well, first, you are an expert in the area, someone who has probably put 1000 times more effort into figuring these things out, so it’s unwise of me to think I can say anything interesting to you in an area you have thought about. I have been on the other side of such a divide in my own area of expertise, and it is easy to spot a dabbler’s thought processes and the basic errors they are making from a mile away. But since you seem to be genuinely asking, I will try to clarify.
At some point a human is going to enter a command or press a button that causes code to start running. That human is going to know that an AI system has been created. (I’m not arguing that all humans will know that an AI system has been created,
Right, those who are informed would know. Those who are not informed may or may not figure it out on their own, and with minimal effort the AI’s hand could probably be masked as a natural event. Maybe I misinterpreted your point. Mine was that, just as an E. coli would not recognize an agent, neither would humans if it weren’t something we are already primed to recognize.
My other point was indeed not a nitpick. It was more that a human-level AI would require a reasonable formalization of the game of human interaction, rather than any new kind of learning mechanism; the existing ones are already good enough. Not an AGI, but a domain AI for a specific human domain that is not obviously a game. Examples might be a news source, an emotional support bot, a science teacher, a poet, an artist…
They nonetheless contain useful information, in a way that E. coli may not. See for example Inverse Reward Design.
Interesting link, thanks! Right, the information can be useful, even if not truthful, as long as the asker can evaluate the reliability of the reply.
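To make “evaluate the reliability of the reply” concrete, here is a toy Bayesian sketch (my own illustration, not the actual Inverse Reward Design model; the hypotheses, the answer, and the 70% reliability figure are all made up): an answer that is a post hoc rationalization and wrong some fraction of the time still shifts the asker’s beliefs, as long as the asker models that error rate.

```python
# Toy sketch: an unreliable answer is still informative once its reliability is modeled.
# All names and numbers are illustrative, not taken from any real system.

# Two hypotheses about what the human actually wants.
hypotheses = ["wants_A", "wants_B"]
prior = {"wants_A": 0.5, "wants_B": 0.5}

# The human answers "I want A". Suppose the asker models this answer as matching
# the true preference only 70% of the time (post hoc rationalization the rest).
reliability = 0.7
likelihood = {
    "wants_A": reliability,        # P(answer "A" | truly wants A)
    "wants_B": 1.0 - reliability,  # P(answer "A" | truly wants B)
}

# Bayes' rule: the unreliable answer still moves the posterior toward "wants_A".
unnormalized = {h: prior[h] * likelihood[h] for h in hypotheses}
total = sum(unnormalized.values())
posterior = {h: p / total for h, p in unnormalized.items()}

print(posterior)  # roughly {'wants_A': 0.7, 'wants_B': 0.3}
```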
Right, those who are informed would know. Those who are not informed may or may not figure it out on their own, and with minimal effort the AI’s hand could probably be masked as a natural event. Maybe I misinterpreted your point. Mine was that, just as an E. coli would not recognize an agent, neither would humans if it weren’t something we are already primed to recognize.
Yup, agreed. All of the “we”s in my original statement (such as “We will know that an AI system has been created”) were meant to refer to the people who created and deployed the AI system, though I now see how that was confusing.