Great classification!
Holistically aligned AI is not safe: the last thing we need is an AI that has sexual desire toward humans. Instead, AI should correctly and safely understand our commands, which is close to “sufficient alignment” from the above definitions.
This is not what I meant by “the same values”, but the comment points towards an interesting point.
When I say “the same values”, I mean the same utility function, as a function over the state of the world (and the states of “R is having sex” and “H is having sex” are different).
The interesting point is that states need to be inferred from observations, and it seems like there are some fundamentally hard issues around doing that in a satisfying way.
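Here is a minimal sketch of what I mean by “the same utility function over world states” (hypothetical Python; the state labels and numbers are made up for illustration):

```python
# Toy sketch: a utility function defined over absolute world states.
# Who the actor is forms part of the state, so "R is having sex" and
# "H is having sex" are simply different inputs.

def utility(state: str) -> float:
    """H's utility over fully specified world states."""
    scores = {
        "H is having sex": 1.0,  # a state H values
        "R is having sex": 0.0,  # a different state; H does not value it
    }
    return scores.get(state, 0.0)

# If the robot R is given this *same* function, R also prefers the world
# in which H is the one having sex; sharing the function does not make R
# substitute itself into the valued state.
assert utility("H is having sex") > utility("R is having sex")
```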
Related to the distinction between 2-place and 1-place words. We want the AI to have the “curried” version of human values, not a symmetric version where the word “me” now refers to the AI itself.
Can you please explain the distinction more succinctly, and say how it is related?
In that case, “paradise for R” and “paradise for H” are different. You need to check out “centered worlds”.
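To make the 2-place/1-place (“curried”) point concrete, here is a hedged Python sketch; the function names and values are illustrative, not from the post:

```python
from functools import partial

# A 2-place value function takes both a subject ("me") and a world state.
def values(subject: str, state: str) -> float:
    """Generic human-like preferences, parameterized by whose they are."""
    return 1.0 if state == f"{subject} is having sex" else 0.0

# The "curried" 1-place version we want the AI to have: the subject is
# fixed to the human H, so the AI evaluates worlds from H's standpoint.
values_centered_on_H = partial(values, "H")

# The "symmetric" version we do not want: "me" is re-bound to the AI R.
values_centered_on_R = partial(values, "R")

assert values_centered_on_H("H is having sex") == 1.0  # intended holistic alignment
assert values_centered_on_R("R is having sex") == 1.0  # the creepy scenario below
```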
In your definition this distinction about the state of the world is not obvious, since humans usually use the words “have the same values” not to mean the same function over the state of the world, but to mean having the same set of preferences centered around a different person.
In that case the situation becomes creepy: for example, I want to have sex with human females. An AI holistically aligned with me will also want to have sex with human females, but I am not happy about it, and the females will also find it creepy. Moreover, if such an AI has 1000 times my capabilities, it will completely dominate the sexual market, getting almost all sex in the world, and humans will go extinct.
An AI with avturchin-like values centered around the AI would completely dominate the sexual market if and only if avturchin would completely dominate the sexual market given the opportunity. More generally, having an AI with human-like values centered around itself is only as bad as having an AI holistically aligned (in the original, absolute sense) with a human who is not you.
As an aside, why would I find it creepy if a godlike superintelligence wants to have sex with me? It’s kinda hot actually :)
I could imagine the following conversation with a holistically aligned AI:
“Mom, I decided to become homosexual.”
“No, you will not do it, because heterosexuality was your terminal value at the moment of my creation.”