Oh, I thought this was relatively straightforward and has been discussed a bunch. There are two lines of argument I know of for why superintelligent AI, even if unaligned, might not literally kill everyone, but keep some humans alive:
The AI might care a tiny bit about our values, even if it mostly doesn’t share them
The AI might want to coordinate with other AI systems that reached superintelligence to jointly optimize the universe. So if there is only a 1% chance that we align AI systems to our values, then even in unaligned worlds we might end up with AI systems that adopt our values as a 1% mixture in their utility functions (and, conversely, in the 1% of worlds where alignment succeeds, we might still want to trade away 99% of the universe to the values that the counterfactual AI systems would have had). A toy version of this arithmetic is sketched below.
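To make that arithmetic concrete, here is a minimal sketch with toy numbers (my own illustration, not from the linked discussions; it assumes a simple proportional trade that every superintelligence honors):

```python
# Toy model of the "trade across possible worlds" argument.
# Assumption (for illustration only): alignment succeeds with probability
# p = 0.01, and all superintelligences honor a proportional trade, so every
# world ends up spending ~1% of its resources on human values and ~99% on
# the unaligned AI's values, matching the probabilities of those outcomes.

p_aligned = 0.01  # chance we succeed at aligning the AI to our values

# No trade: humans get the whole universe in aligned worlds, nothing otherwise.
no_trade_expected_share = p_aligned * 1.0 + (1 - p_aligned) * 0.0
no_trade_survival_prob = p_aligned

# With the trade: every world spends ~1% of its resources on human values,
# so humans survive (with a small share) in essentially all worlds.
trade_expected_share = p_aligned * p_aligned + (1 - p_aligned) * p_aligned
trade_survival_prob = 1.0

print(f"No trade: E[share] = {no_trade_expected_share:.2f}, P(some humans survive) = {no_trade_survival_prob:.2f}")
print(f"Trade:    E[share] = {trade_expected_share:.2f}, P(some humans survive) = {trade_survival_prob:.2f}")
```

Under these assumptions the expected share of the universe spent on human values is ~1% either way; what the trade changes is that some humans survive in essentially all worlds, rather than only in the 1% of worlds where alignment succeeds.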
Note that in this comment I'm not touching on acausal trade (with successful humans) or ECL (evidential cooperation in large worlds). I think those are very relevant to whether AI systems kill everyone, but they are less related to the implicit claim about kindness that comes across in your parables (since acausally trading AIs are basically analogous to the ants who don't kill us because we have power).
Some places where the second line of argument has been discussed:
This comment by Ryan Greenblatt: https://www.lesswrong.com/posts/tKk37BFkMzchtZThx/miri-2024-communications-strategy?commentId=xBYimQtgASti5tgWv
This comment by Paul Christiano: https://www.lesswrong.com/posts/2NncxDQ3KBDCxiJiP/cosmopolitan-values-don-t-come-free?commentId=ofPTrG6wsq7CxuTXk
See also: https://www.lesswrong.com/posts/rP66bz34crvDudzcJ/decision-theory-does-not-imply-that-we-get-to-have-nice