This is not what I meant by “the same values”, but the comment points towards an interesting point.
When I say “the same values”, I mean the same utility function, as a function over the state of the world (and the states of “R is having sex” and “H is having sex” are different).
The interesting point is that states need to be inferred from observations, and it seems like there are some fundamentally hard issues around doing that in a satisfying way.
Related to the distinction between 2-place and 1-place words. We want the AI to have the “curried” version of human values, not a symmetric version where the word “me” now refers to the AI itself.
Can you please explain the distinction more succinctly, and say how it is related?
In that case, “paradise for R” and “paradise for H” are different. You need to check out “centered worlds”.
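To make the curried-vs-recentered distinction concrete, here is a minimal Python sketch. Everything in it (the `two_place_values` function, the toy world dictionaries, and the agent labels "H" and "R") is an illustrative assumption, not something specified in the discussion above:

```python
# Toy illustration: a 2-place value function over (agent, world-state),
# and the 1-place "curried" versions obtained by fixing the agent slot.

def two_place_values(agent, world):
    """Toy 2-place values: the agent cares about worlds in which it, specifically, flourishes."""
    return 1.0 if world.get(agent) == "flourishing" else 0.0

def curried_for(agent):
    """Fix the agent slot once and for all, yielding a 1-place function over world states."""
    return lambda world: two_place_values(agent, world)

values_H = curried_for("H")           # the human's values, with "me" bound to the human H
values_recentered = curried_for("R")  # the wrong target: same 2-place function, "me" re-bound to the AI R

world_good_for_H = {"H": "flourishing", "R": "idle"}
world_good_for_R = {"H": "idle", "R": "flourishing"}

print(values_H(world_good_for_H), values_H(world_good_for_R))                    # 1.0 0.0
print(values_recentered(world_good_for_H), values_recentered(world_good_for_R))  # 0.0 1.0
```

In this sketch the curried function `values_H` ranks worlds by how things go for H no matter which agent is running it, while the re-centered version ranks the same two worlds the opposite way, which is the failure mode discussed in the rest of the thread.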
In your definition this distinction about the state of the world is not obvious, since humans usually use the words “have the same values” not to mean the same function over world states, but to mean the same set of preferences centered around a different person.
In that case, the situation becomes creepy. For example, I want to have sex with human females. An AI holistically aligned with me will also want to have sex with human females, but I am not happy about that, and the females will also find it creepy. Moreover, if such an AI has 1000 times my capabilities, it will completely dominate the sexual market, getting almost all the sex in the world, and humans will go extinct.
An AI with avturchin-like values centered around the AI would completely dominate the sexual market if and only if avturchin would completely dominate the sexual market given the opportunity. More generally, having an AI with human-like values centered around itself is only as bad as having an AI holistically aligned (in the original, absolute sense) with a human who is not you.
As an aside, why would I find it creepy if a godlike superintelligence wants to have sex with me? It’s kinda hot actually :)
I could imagine the following conversation with a holistically aligned AI:
- Mom, I've decided to become homosexual.
- No, you will not, because heterosexuality was your terminal value at the moment of my creation.