I don’t think your scenario works, maybe because I don’t believe the world is as offense-advantaged as you say.
I think the closest domain where things are this offense-biased is biotech, and while I do think biotech leading to doom is something we will eventually have to solve, I’m much less convinced that every other domain is so offense-advantaged that whoever goes first essentially wins the race.
That said, I’m worried about scenarios where we do solve alignment and get catastrophe anyway. Though unlike your scenario, I don’t expect the catastrophe to be existential, since I think humanity’s potential wouldn’t be totally lost.
My expectation, conditional on both alignment being solved and catastrophe still happening, is something close to this scenario by dr_s here:
https://www.lesswrong.com/posts/2ujT9renJwdrcBqcE/the-benevolence-of-the-butcher
While I don’t agree with the claim that this is inevitable, I do think there’s a real chance of this sort of thing happening, and it’s one of those threats that could very well materialize if AI automates most of the economy and leaves most humans unemployed.
I agree entirely with the points made in that post. AGI will only “transform” the economy temporarily. It will very soon replace the economy. That is an entirely separate concern.
If you don’t think a multipolar scenario is as offense-advantaged as I’ve described, where do you think the argument breaks down? What defensive technologies are you envisioning that could counter the types of offensive strategies I’ve mentioned?
Okay, I’m not sure the argument breaks down exactly, but my crux is that everyone else probably has an AGI too, and my issue is similar to Richard Ngo’s issue with ARA: the people ordering ARA have far fewer resources to put into attack than the defense can bring to bear, and real-life wars, while they can favor the attacker, aren’t so offense-advantaged that defense is pointless:
https://www.lesswrong.com/posts/xiRfJApXGDRsQBhvc/we-might-be-dropping-the-ball-on-autonomous-replication-and-1#hXwGKTEQzRAcRYYBF
The issue is that, if you can hide, you can amass resources exponentially once you hit self-replicating production facilities and fully recursively self-improving AGI. This almost completely shifts the logic of all previous conflicts.
The comment you link seems to be addressing a very different scenario from my primary concern: an attack from within human infrastructure, rather than from outside it. What I describe is often not considered, because it seems like the “far future” that we needn’t worry about yet. But that far future realistically looks like a handful of years past human-level AGI, once it starts rapidly developing new technologies like the robotics needed for autonomous, self-replicating production in remote locations.
Then it reduces to “I think the exponential growth of resources is available to both attackers and defenders, such that even while everything is changing, the relative standing of the attack/defense balance doesn’t change.”
I think part of why I’m skeptical is that your scenario assumes exponential growth is only useful for attack, or at least far more useful for attack, whereas I think exponential resource growth from AI tech is much more symmetrical by default.
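To make that symmetry intuition concrete, here’s a minimal toy model (the growth rate and starting stocks are made-up illustrative numbers, not estimates): if both sides compound at the same rate, the ratio between them never moves.

```python
import math

# Toy model: attacker and defender resources both grow exponentially
# at the same rate r from illustrative starting stocks A0 and D0.
A0, D0 = 1.0, 100.0   # assumed starting resources (arbitrary units)
r = 1.5               # assumed common growth rate per year (made up)

for t in range(6):
    A = A0 * math.exp(r * t)
    D = D0 * math.exp(r * t)
    print(f"year {t}: attacker={A:.1f}, defender={D:.1f}, ratio={A / D:.3f}")

# The ratio A/D stays at 0.010 every year: symmetric exponential growth
# leaves the relative attack/defense balance exactly where it started.
```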
Ah—now I see your point. This will help me clarify my concern in future presentations, so thanks!
My concern is that a bad actor will be the first to go all-out exponential. Other, better humans in charge of AGI will be reluctant to turn the moon, much less the earth, into military/industrial production capacity, and to upend the power structure of the world. The worst actors will, by default, be the first to go full exponential and ruthlessly offensive.
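Here’s a minimal sketch of that asymmetry, with made-up growth rates standing in for “full exponential” versus “reluctant”: even a large initial resource disadvantage evaporates within a few years if only one side is willing to compound flat-out.

```python
import math

# Toy model continued (all numbers are illustrative assumptions): the bad
# actor compounds aggressively while more cautious actors, reluctant to go
# all-out exponential, grow at something closer to an ordinary rate.
A0, D0 = 1.0, 100.0   # attacker starts with 1% of the defenders' resources
r_attack = 1.5        # assumed aggressive growth rate per year
r_defend = 0.2        # assumed cautious growth rate per year

for t in range(7):
    A = A0 * math.exp(r_attack * t)
    D = D0 * math.exp(r_defend * t)
    print(f"year {t}: attacker/defender ratio = {A / D:.3f}")

# The ratio grows like e^((r_attack - r_defend) * t); in this toy setup the
# attacker overtakes at t = ln(D0 / A0) / (r_attack - r_defend), about 3.5 years.
```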
Beyond that, I’m afraid the physics of the world does favor offense over defense. It’s pretty easy to release a lot of energy where you want it, and very hard to build anything that can withstand a nuke, let alone a nova.
But the dynamics are more complex than that, of course. So I think the reality is unknown. My point is that this scenario deserves some more careful thought.
Yeah, it does deserve more careful thought, especially since almost all of my probability mass on catastrophe is on it being human-caused, and more importantly, I still think it’s an important enough problem that resources should go toward thinking about it.