Thanks for reading! No existential catastrophe has happened yet in that scenario. The catastrophe will probably happen in 2027-2028 or so if the scenario continues the way I think is most plausible. I’m sorry that I never got around to writing that part; if I had, this would be a much more efficient conversation! Basically, to find our disagreement (assuming you agree my story up till 2026 is plausible), we need to discuss what the plausible continuations to the story are.
Surprise surprise, I think there are plausible (indeed most-plausible) continuations of the story in which the bigass foundation-model chatbots (which, in the story, are misaligned) become, as they scale up, strategically aware agentic planners with superhuman persuasion and coding ability by around 2027. From there (absent clever intervention from prepared people) they gradually accumulate power and influence, slowly at first and then quickly as the R&D acceleration gets going. Maybe the singularity happens around 2030, but well before that point the long-term trajectory of the world is firmly in the grip of some coalition of AIs.
Do you think (a) that the What 2026 Looks Like story is not plausible, (b) that the continuation of it I just sketched is not plausible, or (c) that it’s not worth worrying about or preparing for? I’m guessing (b). Again it’s unfortunate that I haven’t finished the story, if I had we’d have more detailed things to disagree on.
ETA: Since I doubt you have time or interest to engage with me at length about the 2026 scenario, what do you think about the bulk of my criticism above, about coordination ability? Both of us like to analogize AI stuff to historical stuff; it seems we differ in which historical examples we draw from: for you, the steam engine and other bits of machinery; for me, colonial conquests and revolutions.
It seems the key feature of this remaining story is the “coalition of AIs” part. I can believe that AIs would get powerful; what I’m skeptical about is the claim that they would naturally form a coalition against us, which is also what I object to in your prior comments. Horses are terrible at coordination compared to humans, and humans weren’t built by horses and integrated into a horse society, with each human originally in the service of a particular horse.
Yes, horses are terrible at coordination compared to humans, and that’s a big part of why they lost. At some point in prehistory the horses could have coordinated to crush the humans, and didn’t. Even in the 1900s horses could have gone on strike, gone to war, etc., and negotiated better working conditions, but didn’t because they weren’t smart enough.
Similarly, humans are terrible at coordination compared to AIs.
I agree that the fact that humans build the AIs is a huge advantage. But it’s only a huge advantage insofar as we solve alignment or otherwise limit AI capabilities in ways that serve our interests; if instead we YOLO it or half-ass it, and end up with super capable unaligned agentic AGIs… then we will have squandered our advantage.
I don’t think the integration into society or service relationship thing matters much. Imagine the following fictional case:
The island of Atlantis is in the mid-Atlantic. It’s an agrarian empire at the tech level of the ancient Greeks. Atlantis also contains huge oil, coal, gold, and rare earth deposits.
In 1500 Atlantis is discovered by European explorers, who set up trading posts along the coast. For some religious reason, Atlanteans decide to implement the following strict laws:
(a) No Atlantean can learn any language other than Atlantean.
(b) It is heresy, thoughtcrime, for any Atlantean to understand European ideas. You can use European technology (guns, etc.) but you can’t know how it works and you certainly can’t know how to build it yourself, because that would mean you’ve been infected by foreign ideas.
(The point of these restrictions is to mimic how in humans vs. AI, the humans won’t be able to keep up with AIs in capabilities no matter how hard they try. They can use the fancy gadgets and apps the AIs build, but they can’t contribute to frontier R&D. By contrast in historical cases of colonialism and conquest, the technologically weaker side can in principle learn the scientific and economic methods and then catch up to the stronger side.)
Also, the Atlantean coast is a giant cliff along its entire length, so it’s essentially impossible to assault Atlantis from the sea; you need the permission of the locals to land, no matter how large your army is and no matter how advanced your technology is.
Anyhow. Suppose at first the Europeans are too weak to conquer Atlantis and so instead they trade peacefully; Atlantis in fact purchases large quantities of European slaves and hires large quantities of European entrepreneurs and craftsmen and servants. Because of (a) and (b) and the amazing technology and trade goods the Europeans bring, demand for European labor and goods is high and the population of Europeans and their goods grows exponentially. The tech level also rises dramatically until it catches up with the global frontier.
Eventually, by 1800, native Atlanteans are outnumbered 100 to 1, and pretty much the whole economy is run by Europeans, not just at the menial level but at the managerial and R&D levels as well, due to laws (a) and (b). Also, while the fancy weapons and tools are now held by Atlanteans and Europeans alike, the Europeans are much better at using them, since only they are allowed to understand them and build them...
Yet (if I continue the story the way I think you think it should be continued) the Atlanteans remain in control and don’t suffer any terrible calamities, certainly not a coup or anything like that, because the transition was gradual and the Europeans began in a position of servitude and integration in the economy on terms of the Atlanteans’ choosing… Basically, all the arguments you make about why AI coups aren’t a concern apply to this hypothetical scenario as arguments for why European coups aren’t a concern. And ditto for the more general arguments about loss of control.
Anyhow, this all seems super implausible to me, based on my reading of history. Atlanteans in 1500 should be scared that they’ll be disenfranchised and/or killed long before 1800; they should not listen to Hansonian prophets in their midst.
Similarly, humans are terrible at coordination compared to AIs.
Are there any key readings you could share on this topic? I’ve come across arguments about AIs coordinating via DAOs or by reading each others’ source code, including in Andrew Critch’s RAAP. Is there any other good discussion of the topic?
Unfortunately I don’t know of a single reading that contains all the arguments.
This post is relevant and has some interesting discussion below IIRC.
Mostly I think the arguments for AIs being superior at coordination are:
1. It doesn’t seem like humans are near-optimal at coordination, so AIs will eventually be superior to humans at coordination, just like they will eventually be superior at most other things that humans can do but not near-optimally.
2. We can think of various fancy methods (such as reading each other’s source code, etc.; see the sketch below) that AIs might use but humans don’t or can’t.
3. There seems to be a historical trend of increasing coordination ability / social tech; we should expect it to continue with AI.
4. Even if we just model AIs as somewhat smarter, more agentic, more rational humans… it still seems like that would probably be enough. Humans have coordinated coups and uprisings successfully before, and if we imagine the conspirators are all mildly superhuman...
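To make point 2 concrete, here is a minimal, purely illustrative sketch of the “read each other’s source code” idea (sometimes called program equilibrium): an agent whose policy is to cooperate exactly when the other party is verifiably running the same policy. The function name and the "C"/"D" encoding are my own hypothetical choices, not anything from the thread.

```python
import inspect

def mirror_bot(opponent_source: str) -> str:
    """Cooperate iff the opponent is verifiably running this exact policy.

    Agents that can inspect each other's code can make cooperation
    conditional on the other party's verified behavior, a commitment
    device that humans can't offer each other.
    """
    my_source = inspect.getsource(mirror_bot)
    return "C" if opponent_source == my_source else "D"

if __name__ == "__main__":
    me = inspect.getsource(mirror_bot)
    print(mirror_bot(me))                                      # two copies cooperate: "C"
    print(mirror_bot("def always_defect():\n    return 'D'"))  # anything else: "D"
```

Exact source equality is brittle, of course; the more careful versions of this idea in the program-equilibrium / robust-cooperation literature relax it, but the basic point stands that mutual code inspection changes the game relative to human promises.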
I think it might be possible to design an AI scheme in which the AIs don’t coordinate with each other even though it would be in their interest. E.g. perhaps if they are trained to be bad at coordinating, strongly rewarded for defecting on each other, etc. But I don’t think it’ll happen by default.
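For what it’s worth, here is a toy sketch of the “strongly rewarded for defecting on each other” idea; the function name, action labels, and bonus/penalty magnitudes are all made up for illustration, not a proposal from this thread.

```python
def anti_coordination_reward(task_reward: float, own_action: str, other_action: str) -> float:
    """Toy reward shaping that discourages inter-AI coordination.

    Applied on top of the ordinary task reward: a bonus for defecting on
    a would-be AI partner and a penalty for attempting to cooperate.
    Magnitudes are arbitrary and purely illustrative.
    """
    if own_action == "defect" and other_action == "cooperate":
        return task_reward + 1.0  # bonus for exploiting a cooperative AI
    if own_action == "cooperate":
        return task_reward - 0.5  # penalty for trying to coordinate with another AI
    return task_reward
```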