Great post! I think many of the things you say apply equally well to broader categories of scenario too, e.g. your AGI risk model stuff works (with some modification) for different AGI development models than the one you gave. I’d love to see people spell that out, lest skeptics read this post and reply “but that’s not how AGI will be made, therefore this isn’t a serious problem.”
Assuming slow takeoff (again, fast takeoff is even worse), it seems to me that under these assumptions there would probably be a series of increasingly-worse accidents spread out over a decade or two, culminating in irreversible catastrophe, with humanity unable to coordinate to avoid that outcome—due to the coordination challenges in Assumptions 2-4.
This seems too optimistic to me. Even on slow takeoff, things won’t take more than a decade. (Paul is Mr. Slow Takeoff, and even he seems to think it would be more like a decade.) Even if a slow takeoff takes more than a decade, the accidents wouldn’t be spread out that much. Early AI systems will be too stupid to do anything that counts as an accident in the relevant sense (people will just think of it as being like the Tesla self-driving car crashes, or the various incidents of racial bias in image-recognition AI), and later AI systems will be smart enough to be strategic, waiting to strike until the right moment when they can actually succeed instead of just causing an “accident.” (They might do other, more subtle things before then, but they would be subtle, and thus not “accidents” in the relevant sense. They wouldn’t be fire alarms, for example.) Or maybe I am misunderstanding what you mean by accidents?
I haven’t thought very much about takeoff speeds (if that wasn’t obvious!). But I don’t think it’s true that nobody thinks it will take more than a decade… Like, I don’t think Paul Christiano is the #1 slowest of all slow-takeoff advocates. Isn’t Robin Hanson slower? I forget.
Then a different question is: regardless of what other people think about takeoff speeds, what’s the right answer, or at least what’s plausible? I don’t know. A key part is that I’m hazy on when you “start the clock.” People were playing with neural networks in the 1990s, but we only got GPT-3 in 2020. What were people doing all that time?? Well, mostly people were ignoring neural networks entirely, but they were also figuring out how to put them on GPUs; building frameworks like TensorFlow and PyTorch and making them progressively easier to use, scale, and parallelize; finding all the tricks like BatchNorm, Xavier initialization, and Transformers; making better teaching materials and MOOCs to spread awareness of how these things work; developing new and better chips tailored to these algorithms (and vice versa); waiting on Moore’s law; and on and on. I find it conceivable that we could get “glimmers of AGI” (in some relevant sense) in algorithms that have not yet jumped through all those hoops, such that we’re stuck with kinda-toy examples for quite a while as we develop the infrastructure to scale these algorithms, the bag of tricks to make them run better, the MOOCs, the ASICs, and so on. But I dunno.
Or maybe I am misunderstanding what you mean by accidents?
Yeah, sorry, when I said “accidents” I meant “the humans did something by accident”, not “the AI did something by accident”.
Thanks! Yeah, there are plenty of people who think takeoff will take more than a decade—but I guess I’ll just say, I’m pretty sure they are all wrong. :) But we should take care to define what the start point of takeoff is. Traditionally it was something like “When the AI itself is doing most of the AI research,” but I’m very willing to consider alternate definitions. I certainly agree it might take more than 10 years if we define things in such a way that takeoff has already begun.
Yeah, sorry, when I said “accidents” I meant “the humans did something by accident”, not “the AI did something by accident”.
Wait, uh oh, I didn’t mean “the AI did something by accident” either… can you elaborate? By “accident” I thought you meant something like “small-scale disasters, betrayals, etc., caused by AI that are shocking enough to count as warning shots / fire alarms to at least some extent.”
Oh sorry, I misread what you wrote. Sure, maybe, I dunno. I just edited the article to say “some number of years”.
I never meant to make a claim “20 years is definitely in the realm of possibility” but rather to make a claim “even if it takes 20 years, that’s still not necessarily enough to declare that we’re all good”.
I never meant to make a claim “20 years is definitely in the realm of possibility” but rather to make a claim “even if it takes 20 years, that’s still not necessarily enough to declare that we’re all good”.
Ah, OK. We are on the same page then.