I think this is wrong, and a lot of why I disagree with the pivotal act framing probably comes down to disagreeing with the assumption that future technology will be radically biased toward offense. While I do think biotechnology is probably fairly offense-biased today, I also think it’s tractable to reduce bio-risk without trying for pivotal acts.
Also, I think @evhub’s point about the homogeneity of AI takeoff bears on this. While I don’t agree with all of its implications, like there being no warning shot for deceptive alignment (because of synthetic data), I think there’s a real sense in which a lot of AIs are very likely to be very homogeneous, which breaks your point here:
I think it depends on how we interpret Yudkowsky. If we interpret him as saying ‘Even if we get aligned AGI, we need to somehow stop other people from building unaligned AGI’ then yeah, it’s a question of offense-defense balance and homogeneity etc. However, if we interpret him as saying ‘We’ll probably need to proceed cautiously, ramping up the capabilities of our AIs at slower-than-maximum speed, in order to be safe—but that means someone cutting corners on safety will surpass us, unless we stop them’ then offense-defense and homogeneity aren’t the crux. And I do interpret him the second way.
That said, I also probably disagree with Yudkowsky here in the sense that I think that we don’t need powerful AI systems to carry out the most promising ‘pivotal act’ (i.e. first domestic regulation, then international treaty, to ensure AGI development proceeds cautiously.)
I admit I was interpreting him in the first sense: that even if we got an aligned AGI, we would need to stop others from building unaligned AGIs. But I see your interpretation as plausible too, and under that model I agree that we’d ideally not have a maximum-speed race, and instead go somewhat slower as we get closer to AGI and ASI.
I think a maximum sprint for more capabilities is also quite bad, though conditional on that happening I don’t think we’d automatically be doomed; there’s a non-trivial, but far too low, chance that everything works out.
Cool. Then I think we are in agreement; I agree with everything you’ve just said. (Unfortunately I think that when it matters most, around the time of AGI, we’ll be going at close-to-maximum speed, i.e. we’ll maybe be delaying the creation of superintelligence by like 0–6 months relative to if we were pure accelerationists.)
How fast do you think that the AI companies could race from AGI to superintelligence assuming no regulation or constraints on their behavior?
Depends on the exact definitions of both. Let’s say AGI = ‘a drop-in substitute for an OpenAI research engineer’ and ASI = ‘qualitatively at least as good as the best humans at every cognitive task; qualitatively superior on many important cognitive tasks; also, at least 10x faster than humans; also, able to run at least 10,000 copies in parallel in a highly efficient organizational structure (at least as efficient as the most effective human organizations, like SpaceX)’.
In that case I’d say probably about eight months? Idk. Could be more like eight weeks.