Here is my snapshot. My reasoning is basically similar to Ethan Perez’s; it’s just that I think that if transformative AI is achievable within the next five orders of magnitude of compute improvement (e.g. prosaic AGI?), it will likely be achieved in the next five years or so. I’m also slightly more confident that it is achievable, and slightly less confident that TAI will ever be achieved.
I am aware that my timelines are shorter than most… Either I’m wrong and I’ll look foolish, or I’m right and we’re doomed. Sucks to be me.
[Edited the snapshot slightly on 8/23/2020]
[Edited to add the following PowerPoint slide, which gets at my reasoning a bit more]
I feel like taking some bets against this; it’s very extreme.
I’d be happy to bet on this. I already have a 10:1 bet with someone to the tune of $1000/$100.
However, I’d be even happier to just have a chat with you and discuss models/evidence. I could probably be updated away from my current position fairly easily. It looks like your distribution isn’t that different from my own though? [EDIT 8/26/2020: Lest I give the wrong impression, I have thought about this a fair amount and as a result don’t expect to update substantially away from my current position. There are a few things that would result in substantial updates, which I could name, but mostly I’m expecting small updates.]
Me: Wonders how much disagreement we have.
Me: Plots it.
I’d be down to chat with you :) although, plotting it, I wonder if maybe I don’t have that much to say.
I think the main differences are that (a) you’re assigning much more probability to the next 10 years, and (b) you’re assigning way less to worlds where it’s just harder and takes more effort, but overall we’re still on the right path.
My strawman is that mine is yours plus the planning fallacy. I feel like the crux between us is something like “I have a bunch of probability on it (even though we’re on the right track) just having a lot of details and requiring human coordination of big projects that we’re not great at right now”, but that sounds very vague and uncompelling.
Added: I think you’re focused on the scaling hypothesis being true. Given that, how do you feel about this story:
Over the next five years we scale up AI compute and use up all the existing overhang, and this doesn’t finish it but it provides strong evidence that we’re nearly there. This makes us confident that if we scaled it up another three orders of magnitude we’d get AGI, and then we figure out how to get there faster using lots of fine-grained optimisation, so we raise funds to organise a Manhattan project with 1000s of scientists, and this takes 5 years to set up and 10 years to execute. Oh, and also there’s a war over control of the compute and so on that makes things difficult.
In a world like this, I feel like it’s bottlenecked by our ability to organise large-scale projects, over which I have smoother (rather than spiky) uncertainty. (Like, how long did the Large Hadron Collider take to build?) Does that sound plausible to you, that it’s bottlenecked by large-scale human projects?
Oh, whoa, OK, I guess I was looking at your first forecast, not your second. Your second is substantially different. Yep, let’s talk then? Want to video chat someday?
I tried to account for the planning fallacy in my forecast, but yeah I admit I probably didn’t account for it enough. Idk.
My response to your story is that yeah, that’s a possible scenario, but it’s a “knife edge” result. It might take <5 OOMs more, in which case it’ll happen with existing overhang. Or it’ll take >7 OOMs more compute, in which case it’ll not happen until new insights/paradigms are invented. If it takes 5-7 OOMs more, then yeah, we’ll first burn through the overhang and then need to launch some huge project in order to reach AGI. But that’s less likely than the other two scenarios.
(I mean, it’s not literally knife’s edge. It’s probably about as likely as the we-get-AGI-real-soon scenario. But then again I have plenty of probability mass around 2030, and I think 10 years from now is plenty of time for more Manhattan projects.)
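To make the “not literally knife’s edge, but less likely than the other two” point concrete, here is a minimal sketch under made-up assumptions: put a wide distribution over how many extra OOMs of compute are needed, and see how much mass lands in each of the three bands. The NormalDist(mu=6, sigma=4) belief is purely illustrative, not anyone’s actual forecast.

```python
# Illustrative sketch only: a made-up belief over "extra OOMs of training
# compute needed for AGI", used to show how the three scenarios split.
from statistics import NormalDist

ooms_needed = NormalDist(mu=6.0, sigma=4.0)  # hypothetical, deliberately wide

p_overhang     = ooms_needed.cdf(5)                       # <5 OOMs: existing overhang suffices
p_megaproject  = ooms_needed.cdf(7) - ooms_needed.cdf(5)  # 5-7 OOMs: burn overhang, then huge project
p_new_paradigm = 1 - ooms_needed.cdf(7)                   # >7 OOMs: new insights/paradigms needed

print(f"P(<5 OOMs)  ~ {p_overhang:.2f}")      # ~0.40
print(f"P(5-7 OOMs) ~ {p_megaproject:.2f}")   # ~0.20 -- the narrow middle band gets the least mass
print(f"P(>7 OOMs)  ~ {p_new_paradigm:.2f}")  # ~0.40
```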
Let’s do it. I’m super duper busy, please ping me if I’ve not replied here within a week.
Ping?
Sounds good. Also, check out the new image I added to my answer! This image summarizes the weightiest model in my mind.
It’s been a year, what do my timelines look like now?
My median has shifted to the left a bit; it’s now 2030. However, I have somewhat less probability in the 2020-2025 range, I think, because I’ve become more aware of the difficulties in scaling up compute. You can’t just spend more money. You have to do lots of software engineering, and for 4+ OOMs you literally need to build more chip fabs to produce more chips. (Also because 2020 has passed without TAI/AGI/etc., so obviously I won’t put as much mass there...)
So if I were to draw a distribution it would look pretty similar, just a bit more extreme of a spike and the tip of the spike might be a bit to the right.
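As a rough illustration of why “+4 OOMs” is not just a matter of writing a bigger cheque, here is a back-of-the-envelope sketch; the $10M baseline run cost and the assumed 2x hardware and 2x software gains are placeholder numbers, not figures from this thread.

```python
# Back-of-the-envelope sketch with placeholder numbers (not data from the thread):
# how far does money alone get you towards a +4 OOM compute scale-up?
baseline_run_cost_usd = 1e7   # assumed cost of a ~2020-era frontier training run
target_scaleup        = 1e4   # the +4 OOMs of training compute being discussed

naive_cost = baseline_run_cost_usd * target_scaleup
print(f"Cost if you only spend more: ${naive_cost:,.0f}")                 # $100,000,000,000

# Suppose hardware price-performance and software/algorithmic efficiency each
# improve ~2x over the period (again, assumed numbers).
hardware_gain, software_gain = 2.0, 2.0
cost_after_gains = naive_cost / (hardware_gain * software_gain)
print(f"Cost after assumed 4x combined gains: ${cost_after_gains:,.0f}")  # $25,000,000,000

# Either way the bill is tens of billions of dollars -- comparable to building
# new chip fabs, not just renting more of the existing cloud.
```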
“You have to do lots of software engineering, and for 4+ OOMs you literally need to build more chip fabs to produce more chips.”
I have probably missed many considerations you have mentioned elsewhere, but in terms of software engineering, how do you think the “software production rate” for scaling up large models evolved from 2020 to late 2021? I don’t see why we couldn’t get 4 OOMs between 2020 and 2025.
If we just take the example of large LMs, we went from essentially 1-10 publicly known models in 2020 to 10-100 in 2021 (cf. China, Korea, Microsoft, DM, etc.), and I expect the number of private models to be even higher, so it makes sense to me that we could have 4 OOMs more SWE effort in that area by 2025.
Now, for the chip fabs, I feel like one update from 2020 to 2022 has been Nvidia & Apple making unexpected hardware advances (A100, M1) and Nvidia stock growing massively, so I would be more optimistic about “build more fabs” than in 2020. Though I’m not an expert in hardware at all, and those two advances I mentioned were maybe not that useful for scaling.
If I understand you correctly, you are asking something like: How many programmer-hours of effort and/or how much money was being spent specifically on scaling up large models in 2020? What about in 2025? Is the latter plausibly 4 OOMs more than the former? (You need some sort of arbitrary cutoff for what counts as large. Let’s say GPT-3 sized or bigger.)
Yeah maybe, I don’t know! I wish I did. It’s totally plausible to me that it could be +4 OOMs in this metric by 2025. It’s certainly been growing fast, and prior to GPT-3 there may not have been much of it at all.
Yes, something like: given the growth from programmer-hours-into-scaling(July 2020) to programmer-hours-into-scaling(Jan 2022), and how much progress there has been on hardware for such training (I don’t know the right metric for this, but probably something to do with FLOP and parallelization), whether the extrapolation to 2025 (either linear or exponential) would give the 4 OOMs you mentioned.
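Something like the sketch below, then; all of the input numbers are hypothetical placeholders (neither side of this exchange has the real effort data), just to show how the exponential extrapolation would work.

```python
# Hypothetical sketch of the extrapolation (placeholder numbers, not real data).
import math

effort_jul_2020 = 1.0    # scaling effort in July 2020, arbitrary units
effort_jan_2022 = 30.0   # assumed effort in Jan 2022, i.e. a hypothetical ~30x growth
years_observed  = 1.5    # July 2020 -> Jan 2022
years_to_target = 4.5    # July 2020 -> roughly Jan 2025

# Exponential extrapolation: assume the observed growth rate simply continues.
annual_growth = (effort_jan_2022 / effort_jul_2020) ** (1 / years_observed)
effort_2025   = effort_jul_2020 * annual_growth ** years_to_target

print(f"Implied annual growth: {annual_growth:.1f}x")                  # ~9.7x per year
print(f"Extrapolated 2025 effort: {effort_2025:.0f}x the 2020 level")  # ~27,000x
print(f"That is about {math.log10(effort_2025):.1f} OOMs of growth")   # ~4.4 OOMs
```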
Blast from the past!
I’m biased but I’m thinking this “33% by 2026” forecast is looking pretty good.
Is your P(AGI | no AGI before 2040) really that low?
Eyeballing the graph in light of the fact that the 50th percentile is 2034.8, it looks like P(AGI | no AGI before 2040) is about 30%. Maybe that’s too low, but it feels about right to me, unfortunately. 20 years from now, science in general (or at least in AI research) may have stagnated, with Moore’s Law etc. ended and a plateau in new AI researchers. Or maybe a world war or other disaster will have derailed everything. Etc. Meanwhile 20 years is plenty of time for powerful new technologies to appear that accelerate AI research.
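For concreteness, the quantity being eyeballed is just P(AGI after 2040) divided by P(no AGI before 2040), where the denominator includes the “never” mass. The numbers below are illustrative guesses chosen to roughly reproduce the ~30% figure, not values read off the actual snapshot.

```python
# Illustrative arithmetic only; the three probabilities are assumed, not read
# off the real forecast snapshot.
p_agi_before_2040 = 0.70   # assumed mass before 2040 (the median is ~2035)
p_agi_after_2040  = 0.09   # assumed mass on "after 2040, but eventually"
p_never           = 0.21   # assumed mass on "never"

p_no_agi_before_2040    = p_agi_after_2040 + p_never       # = 1 - p_agi_before_2040
p_agi_given_not_by_2040 = p_agi_after_2040 / p_no_agi_before_2040
print(f"P(AGI eventually | no AGI before 2040) = {p_agi_given_not_by_2040:.0%}")  # 30%
```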
Is this your inside view, or your “all things considered” forecast? I.e., how do you update, if at all, on other people disagreeing with you?
This is my all-things-considered forecast, though I’m open to the idea that I should weight other people’s opinions more than I do. It’s not that different from my inside-view forecast, i.e. I haven’t modified it much in light of the opinions of others. I haven’t tried to graph my inside view; it would probably look similar to this, only with a higher peak and a bit less probability mass in the “never” category and in the “more than 20 years” region.
I’m somewhat confused as to how “slightly more confident” and “slightly less confident” equate to doom, which is a pretty strong claim IMO.
I don’t literally think we are doomed. I’m just rather pessimistic about our chances of aligning AI if it is happening in the next 5 years or so.
My confidence in prosaic AGI is 30% to Ethan’s 25%, and my confidence in “more than 2100” is 15% to Ethan’s… Oh wait he has 15% too, huh. I thought he had less.
I updated to 15% (from 5%) after some feedback so you’re right that I had less originally :)
Just as well, I’m also less confident that shorter timelines are congruent with a high, irreducible probability of failure.
EDIT: If doom is instead congruent simply with the advent of prosaic AGI, then I still disagree; even more so, actually.