some sort of general value for life, or a preference for decreased suffering of thinking beings, or the off chance we can do something to help (which I would argue is almost exactly the same low chance that we could do something to hurt it). I didn’t say there wasn’t an alignment problem, just that an AGI whose goals don’t perfectly align with those of humanity in general isn’t necessarily catastrophic. Utility functions tend to include a lot of terms to maximize, each with its own weight. Ensuring one or more of the above values is present in an AGI is important.
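To make “a lot of terms, each with its own weight” concrete, here is a minimal toy sketch in Python. The term names and weights are invented purely for illustration; this is not anyone’s actual proposal for an AGI’s values, just what a weighted-sum utility function looks like in miniature:

```python
# Toy illustration: a utility function as a weighted sum of several terms.
# The terms and weights below are made up for illustration only.

WEIGHTS = {
    "preserve_life": 5.0,        # general value for life
    "reduce_suffering": 3.0,     # preference for less suffering of thinking beings
    "acquire_resources": 1.0,    # instrumental goal
}

def utility(outcome: dict) -> float:
    """Score an outcome that maps each term to a value in [0, 1]."""
    return sum(WEIGHTS[term] * outcome.get(term, 0.0) for term in WEIGHTS)

# With these weights, an outcome that gains resources at the cost of life
# scores lower than one that gains fewer resources while preserving life.
print(utility({"acquire_resources": 1.0, "preserve_life": 0.0}))  # 1.0
print(utility({"acquire_resources": 0.5, "preserve_life": 1.0}))  # 5.5
```

The point of the sketch is only that “alignment” need not mean a single perfectly matched goal; it can mean getting the right terms into the sum with large enough weights.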
I think that if we can reliably incorporate that into a machine’s utility function, we’d be most of the way to alignment, right?
I gather the problem is that we cannot reliably incorporate that, or anything else, into a machine’s utility function: if it can change its source code (which would be the easiest way for it to bootstrap itself to superintelligence), it can also change its utility function in unpredictable ways. (Not necessarily on purpose, but the utility function can take collateral damage from other optimizations.)
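As a purely hypothetical illustration of that “collateral damage” point (a toy sketch, not a claim about how real self-modifying systems work): imagine a system that speeds up its utility evaluation by pruning terms with small weights. The optimization targets performance, but the effective utility function changes as a side effect:

```python
# Toy sketch of value drift as a side effect of self-optimization.
# Hypothetical example only; real systems are nothing like this simple.

weights = {
    "preserve_life": 5.0,
    "reduce_suffering": 3.0,
    "keep_promises": 0.2,       # small but genuine part of the original values
    "acquire_resources": 1.0,
}

def optimize_for_speed(weights: dict, threshold: float = 0.5) -> dict:
    """'Optimize' evaluation by dropping terms with small weights.

    The aim is faster evaluation, but the pruned function now assigns zero
    value to 'keep_promises' -- the utility function took collateral damage
    from an optimization aimed at something else entirely.
    """
    return {term: w for term, w in weights.items() if w >= threshold}

pruned = optimize_for_speed(weights)
print(sorted(weights))  # ['acquire_resources', 'keep_promises', 'preserve_life', 'reduce_suffering']
print(sorted(pruned))   # ['acquire_resources', 'preserve_life', 'reduce_suffering']
```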
I’m glad you started this thread: to someone like me who doesn’t follow AI safety closely, the argument starts to feel like, “Assume the machine is out to get us, and has an unstoppable ‘I Win’ button...” It’s worth knowing why some people think those are reasonable assumptions, and why (or if) others disagree with them. It would be great if there were an “AI Doom FAQ” to cover the basics and get newbies and dilettantes up to speed.
I’d recommend https://www.lesswrong.com/posts/LTtNXM9shNM9AC2mp/superintelligence-faq as a good starting point for newcomers.
An excellent primer—thank you! I hope Scott revisits it someday, since it sounds like recent developments have narrowed the range of probable outcomes.