The general sentiment on which LessWrong is founded assumes that it’s hard to have utility functions that are stable under self-modification, and that’s one of the reasons why friendly AGI is a very hard problem.
Would it be likely for the utility function to flip *completely*, though? There’s a difference between some drift in the utility function and the AI screwing up and designing a successor with the complete opposite of its utility function.
Any AGI is likely complex enough that there wouldn’t be a complete opposite, but you don’t need a full flip for an AGI that gets rid of all humans.
The scenario I’m imagining isn’t an AGI that merely “gets rid of” humans. See SignFlip.