Why would it “want” to keep humans around? How much do you care about whether or not you move dirt while you drive to work? If you don’t care about something at all, it won’t factor into your choice of actions.[1]
I know I phrased this tautologically, but I think the point of the idiom is clear. If not, just press me on it. I think this is the best way to get the message across, or I wouldn’t have put it that way.
Some sort of general value for life, or a preference for decreased suffering of thinking beings, or the off chance we could do something to help it (which I would argue is almost exactly the same low chance that we could do something to hurt it). I didn’t say there isn’t an alignment problem, just that an AGI whose goals don’t perfectly align with those of humanity in general isn’t necessarily catastrophic. Utility functions tend to have a lot of terms they try to maximize, with different weights. Ensuring one or more of the above ideas is present in an AGI is the important part.
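To make the “many weighted terms” point concrete, here is a minimal sketch in Python. The term names and weights are invented purely for illustration, not a claim about how any real system is built: the idea is just that a utility function can be a weighted sum, and even a small weight on a “value for life” term changes which plans score best.

```python
# Toy illustration: a utility function as a weighted sum of terms.
# All term names and weights here are invented for this example only.
WEIGHTS = {
    "primary_goal": 1.0,        # whatever the AGI was actually built to do
    "human_life": 0.2,          # general value for (human) life
    "reduced_suffering": 0.1,   # preference for less suffering of thinking beings
}

def utility(outcome: dict) -> float:
    """Score an outcome: each term contributes its value times its weight."""
    return sum(WEIGHTS[name] * outcome.get(name, 0.0) for name in WEIGHTS)

# Even a small positive weight on "human_life" means an otherwise-identical
# plan that preserves humanity scores higher than one that doesn't.
print(utility({"primary_goal": 10.0, "human_life": 1.0}))   # 10.2
print(utility({"primary_goal": 10.0, "human_life": -1.0}))  # 9.8
```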
I think that if we could reliably incorporate that into a machine’s utility function, we’d be most of the way to alignment, right?
I gather the problem is that we cannot reliably incorporate that, or anything else, into a machine’s utility function: if it can change its source code (which would be the easiest way for it to bootstrap itself to superintelligence), it can also change its utility function in unpredictable ways. (Not necessarily on purpose, but the utility function can take collateral damage from other optimizations.)
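As a toy picture of that “collateral damage” worry (again, the names and numbers are purely illustrative, not how real systems work): an optimization pass that prunes “negligible” parts of the agent’s own machinery could silently drop exactly the low-weight terms we cared about.

```python
# Toy illustration of utility-function "collateral damage" from self-modification.
# Everything here is invented for the example.
weights = {"primary_goal": 1.0, "human_life": 0.2, "reduced_suffering": 0.1}

def simplify(w: dict, threshold: float = 0.25) -> dict:
    """An 'optimization' pass that prunes terms it judges negligible."""
    return {k: v for k, v in w.items() if abs(v) >= threshold}

weights = simplify(weights)
print(weights)  # {'primary_goal': 1.0} -- the human-related terms are gone,
                # not because anything chose that, but as a side effect.
```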
I’m glad you started this thread: to someone like me who doesn’t follow AI safety closely, the argument starts to feel like, “Assume the machine is out to get us, and has an unstoppable ‘I Win’ button...” It’s worth knowing why some people think those are reasonable assumptions, and why (or whether) others disagree with them. It would be great if there were an “AI Doom FAQ” to cover the basics and get newbies and dilettantes up to speed.
I’d recommend https://www.lesswrong.com/posts/LTtNXM9shNM9AC2mp/superintelligence-faq as a good starting point for newcomers.
An excellent primer—thank you! I hope Scott revisits it someday, since it sounds like recent developments have narrowed the range of probable outcomes.