As one relevant consideration, I think “will AI kill all humans” is a question whose answer relies in substantial part on TDT-ish considerations, and is something that I think a bunch of value systems reasonably care a lot about. Also, I think what superintelligent systems will do depends a lot on decision-theoretic considerations that seem very hard to answer from a CDT- vs. EDT-ish frame.
I think I speak for many when I ask you to please elaborate on this!
Oh, I thought this was relatively straightforward and had been discussed a bunch. There are two lines of argument I know of for why a superintelligent AI, even if unaligned, might not literally kill everyone, but keep some humans alive:
The AI might care a tiny bit about our values, even if it mostly doesn’t share them
The AI might want to coordinate with other AI systems that reached superintelligence (including ones in worlds where alignment succeeded) to jointly optimize the universe. So in a world where there is only a 1% chance that we align AI systems to our values, even in the unaligned worlds we might end up with AI systems that adopt our values as a 1% mixture in their utility function (and, correspondingly, in the 1% of worlds where alignment succeeds, we might still want to trade away 99% of the universe to the values that the counterfactual unaligned AI systems would have had). A toy expected-value sketch of this trade follows below.
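To make the arithmetic in that second argument concrete, here is a minimal sketch of my own (not from the original comment) under toy assumptions: a single scalar “share of the universe optimized for human values”, a 1% chance that alignment succeeds, and a trade in which each side receives a slice proportional to its probability of ending up in control. The point it illustrates is that the trade leaves the expected share unchanged, but converts “everything in 1% of worlds” into “a 1% slice in roughly all worlds”:

```python
# Toy expected-value sketch of the "1% mixture" trade described above.
# All numbers here are illustrative assumptions, not anything from the
# original comment beyond the 1% / 99% figures it mentions.

p_aligned = 0.01  # assumed probability that we succeed at aligning AI

# No cross-world trade: humans get the whole universe in aligned worlds
# and nothing in unaligned worlds.
no_trade_expected_share = p_aligned * 1.0 + (1 - p_aligned) * 0.0
no_trade_survival_chance = p_aligned  # humans only get anything if alignment works

# With the trade: in every world the winning AI optimizes a mixture
# weighted by each side's probability of winning, so human values get a
# 1% slice of the universe whether or not alignment succeeded.
trade_expected_share = p_aligned * p_aligned + (1 - p_aligned) * p_aligned
trade_survival_chance = 1.0  # a nonzero slice in (roughly) all worlds

print(f"expected share, no trade:   {no_trade_expected_share:.2%}")
print(f"expected share, with trade: {trade_expected_share:.2%}")
print(f"chance of any slice, no trade:   {no_trade_survival_chance:.0%}")
print(f"chance of any slice, with trade: {trade_survival_chance:.0%}")
```

Both expected shares come out to 1%; what changes is how that expectation is distributed across worlds, which is what matters if you care a lot about humans not being killed in the modal outcome.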
Some places where the second line of argument has been discussed:
This comment by Ryan Greenblatt: https://www.lesswrong.com/posts/tKk37BFkMzchtZThx/miri-2024-communications-strategy?commentId=xBYimQtgASti5tgWv
This comment by Paul Christiano: https://www.lesswrong.com/posts/2NncxDQ3KBDCxiJiP/cosmopolitan-values-don-t-come-free?commentId=ofPTrG6wsq7CxuTXk
See also: https://www.lesswrong.com/posts/rP66bz34crvDudzcJ/decision-theory-does-not-imply-that-we-get-to-have-nice

Note that in this comment I’m not touching on acausal trade (with successful humans) or ECL. I think those are very relevant to whether AI systems kill everyone, but they are less related to the implicit claim about kindness that comes across in your parables (since acausally trading AIs are basically analogous to the ants who don’t kill us because we have power).