Why would modern technology-using humans ‘want’ to destroy the habitats of the monkeys and apes that are the closest thing they still have to a living ancestor in the first place? Don’t we feel gratitude and warmth and empathy and care-for-the-monkey’s-values such that we’re willing to make small sacrifices on their behalf?
(Spoilers: no, not in the vast majority of cases. :/ )
The answer is “we didn’t want to destroy their habitats, in the sense of actively desiring it, but we had better things to do with the land and the resources, according to our values, and we didn’t let the needs of the monkeys and apes slow us down even the slightest bit until we’d already taken like 96% of everything and even then preservation and conservation were and remain hugely contentious.”
You have to be careful with the metaphor, because it can lead people to erroneously assume that an AI would be at least that nice, which is not at all obvious or likely for various reasons (that you can read about in the book when it comes out in September!). But the thing that justifies treating catastrophic outcomes as the default is that catastrophic outcomes are the default. There are rounds-to-zero examples of things that are 10-10000x smarter than Other Things cooperating with those Other Things’ hopes and dreams and goals and values. That humans do this at all is part of our weirdness, and worth celebrating, but we’re not taking seriously the challenge involved in robustly installing such a virtue into a thing that will then outstrip us in every possible way. We don’t even possess this virtue ourselves to a degree sufficient that an ant or a squirrel standing between a human and something that human wants should feel no anxiety.
Don’t we feel gratitude and warmth and empathy and care-for-the-monkey’s-values such that we’re willing to make small sacrifices on their behalf?
People do make small sacrifices on behalf of monkeys? Like, >1/billion of human resources is spent on doing things for monkeys (this is just >$100k per year). And, in the case of AI takeover, 1/billion could easily suffice to avoid literal human extinction (with some chance of avoiding mass fatalities due to AI takeover). This isn’t to say that after AI takeover humans would have much control over the future or that the situation wouldn’t be very bad on my view (or on the views of most people, at least on reflection). Like, even if some (or most/all) humans survive, it’s still an x-risk if we lose control over the longer-run future.
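(A rough back-of-the-envelope sketch of where that “>$100k per year” figure comes from, assuming gross world product of roughly $100 trillion per year, a number not given in the thread:)

```python
# Back-of-the-envelope check: 1/billion of annual human resources, in dollars.
# The gross-world-product figure is an assumption, not something stated in the thread.
gross_world_product = 100e12   # ~$100 trillion per year (rough assumption)
fraction_for_monkeys = 1e-9    # "1 / billion" of human resources

annual_spend = gross_world_product * fraction_for_monkeys
print(f"1/billion of ~$100T/yr is about ${annual_spend:,.0f}/yr")  # -> about $100,000/yr
```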
Like I agree with the claim that people care very little about the interests of monkeys and don’t let those interests slow them down in the slightest. But, the exact amount of caring humans exhibit probably would suffice for avoiding literal extinction in the case of AIs.
I think your response is “sure, but AIs won’t care at all”:
You have to be careful with the metaphor, because it can lead people to erroneously assume that an AI would be at least that nice, which is not at all obvious or likely for various reasons (that you can read about in the book when it comes out in September!).
Agree that it’s not obvious, and I think I tentatively expect AIs that take over are less “nice” in this way than humans are. But, I think it’s pretty likely (40%?) they are “nice” enough to care about humans some tiny amount that suffices for avoiding extinction (while also not having specific desires about what to do with humans that interfere with this), and there is also the possibility of (acausal) trade resulting in human survival. In aggregate, I think these make extinction less likely than not. (But they don’t mean that the value of the future isn’t (mostly) lost.)
Obviously (and as you note), this argument doesn’t suggest that humans would all die; it suggests that a bunch of them would die. (An AI estimated that monkey populations are down 90% due to humans.)
And if we want to know how many exactly would die, we’d have to get into the details, as has been done for example in the comments linked from here.
So I think that this analogy is importantly not addressing the question you were responding to.
I disagree with your “obviously,” which seems both wrong and dismissive, and seems like you skipped over the sentence that was written specifically in the hopes of preventing such a comment:
You have to be careful with the metaphor, because it can lead people to erroneously assume that an AI would be at least that nice, which is not at all obvious or likely for various reasons
(Like, c’mon, man.)

Edited, is it clearer now?

No, the edit completely fails to address or incorporate
You have to be careful with the metaphor, because it can lead people to erroneously assume that an AI would be at least that nice, which is not at all obvious or likely for various reasons
...and now I’m more confused at what’s going on. Like, I’m not sure how you missed (twice) the explicitly stated point that there is an important disanalogy here, and that the example given was more meant to be an intuition pump. Instead you seem to be sort of like “yeah, see, the analogy means that at least some humans would not die!” which, um. No. It would imply that, if the analogy were tight, but I explicitly noted that it isn’t and then highlighted the part where I noted that, when you missed it the first time.
(I probably won’t check in on this again; it feels doomy given that you seem to have genuinely expected your edit to improve things.)
Separately, I will note (shifting the (loose) analogy a little) that if someone were to propose “hey, why don’t we put ourselves in the position of wolves circa 20,000 years ago? Like, it’s actually fine to end up corralled and controlled and mutated according to the whims of a higher power, away from our present values; this is actually not a bad outcome at all; we should definitely build a machine that does this to us,”
they would be rightly squinted at.
Like, sometimes one person is like “I’m pretty sure it’ll kill everyone!” and another person responds “nuh-uh! It’ll just take the lightcone and the vast majority of all the resources and keep a tiny token population alive under dubious circumstances!” as if this is, like, sufficiently better to be considered good, and to have meaningfully dismissed the original concern.
It is better in an absolute sense, but again: “c’mon, man.” There’s a missing mood in being like “yeah, it’s only going to be as bad as what happened to monkeys!” as if that’s anything other than a catastrophe.
(And again: it isn’t likely to only be as bad as what happened to monkeys.)
(But even if it were, wolves of 20,000 years ago, if you could contrive to ask them, would not endorse the present state of wolves-and-dogs today. They would not choose that future. Anyone who wants to impose an analogous future on humanity is not a friend, from the perspective of humanity’s values. Being at all enthusiastic about that outcome feels like a cope, or something.)
To be clear, Buck’s view is that it is a very bad outcome if a token population is kept alive (e.g., all/most currently alive humans) but (misaligned) AIs control the vast majority of resources. And, he thinks most of the badness is due to the loss of the vast majority of resources.
He didn’t say “and this would be fine” or “and I’m enthusiastic about this outcome”; he was just making a local validity point and saying you weren’t effectively addressing the comment you were responding to.
(I basically agree with the missing-mood point; if I were writing the same comment Buck wrote, I would have more explicitly noted the loss of value and my agreements.)