AI doom from an LLM-plateau-ist perspective
(in the form of an FAQ)
Q: What do you mean, “LLM plateau-ist”?
A: As background, I think it’s obvious that there will eventually be “transformative AI” (TAI) that would radically change the world.[1]
I’m interested in what this TAI will eventually look like algorithmically. Let’s list some possibilities:
A “Large Language Model (LLM) plateau-ist” is someone who thinks that categories (A-B), and usually also (C), will plateau in capabilities before reaching TAI levels.[2] I am an LLM plateau-ist myself.[3]
I’m not going to argue about whether LLM-plateau-ism is right or wrong—that’s outside the scope of this post, and also difficult for me to discuss publicly thanks to infohazard issues.[4] Oh well, we’ll find out one way or the other soon enough.
In the broader AI community, both LLM-plateau-ism and its opposite seem plenty mainstream. Different LLM-plateau-ists have different reasons for holding this belief. I think the two main categories are:
Theoretical—maybe they have theoretical beliefs about what is required for TAI, and they think that LLMs just aren’t built right to do the things that TAI would need to do.
Empirical—maybe they’re not very impressed by the capabilities of current LLMs. Granted, future LLMs will be better than current ones. But maybe they have extrapolated that our planet will run out of data and/or compute before LLMs get all the way up to TAI levels.
Q: If LLMs will plateau, then does that prove that all the worry about AI x-risk is wrong and stupid?
A: No no no, a million times no, and I’m annoyed that this misconception is so rampant in public discourse right now.
(Side note to AI x-risk people: If you have high credence that AI will kill everyone but only medium credence that this AI will involve LLMs, then maybe consider trying harder to get that nuance across in your communications. E.g. Eliezer Yudkowsky is in this category, I think.)
A couple random examples I’ve seen of people failing to distinguish “AI may kill everyone” from “…and that AI will definitely be an LLM”:
Venkatesh Rao’s blog post “Beyond Hyperanthropomorphism” goes through an elaborate 7000-word argument that eventually culminates, in the final section, in his assertion that a language model trained on internet data won’t be a powerful agent that gets things done in the world, but if we train an AI with a robot body, then it could be a powerful agent that gets things done in the world. OK fine, let’s suppose for the sake of argument he’s right that robot bodies will be necessary for TAI.[5] Then people are obviously going to build those AIs sooner or later, right? So let’s talk about whether they will pose an x-risk. But that’s not what Venkatesh does. Instead he basically treats “they will need robot bodies” as the triumphant conclusion, more-or-less sufficient in itself to prove that AI x-risk discourse is stupid.
Sarah Constantin’s blog post entitled “Why I am not an AI doomer” states right up front that she agrees “1. Artificial general intelligence is possible in principle … 2. Artificial general intelligence, by default, kills us all … 3. It is technically difficult, and perhaps impossible, to ensure an AI values human life.” She only disagrees with the claim that this will happen soon, and via scaling LLMs. I think she should have picked a different title for her post!!
(I’ve seen many more examples on Twitter, reddit, comment threads, etc.)
Anyway, if you think LLMs will plateau, then you can probably feel confident that we won’t get TAI imminently (see below), but I don’t see why you would have much more confidence that TAI will go well for humanity. In fact, for my part, if I believed that (A)-type systems were sufficient for TAI—which I don’t—then I think I would feel slightly less concerned about AI x-risk than I actually do, all things considered!
The case that AI x-risk is a serious concern long predates LLMs. If you want a non-LLM-centric AI x-risk discussion, then most AI x-risk discussions ever written would qualify. I have one here, or see anything written more than a year or two ago, for example.
Q: Do the people professionally focused on AI x-risk tend to think LLMs will plateau? Or not?
A: I haven’t done a survey or anything, but I’ll give a hot-take gestalt impression. (This is mostly based on reading what everyone is writing, and occasionally chatting with people at conferences and online.)
Certainly, there is a mix of opinions. But one common pattern I’ve noticed recently (i.e. in the last year), though it certainly doesn’t apply to everyone, is a weird disconnect: when asked explicitly, someone will claim to have lots of credence (even 50% or more) outside (C), but everything they say and do is exactly as if all of their credence were in (A-C); indeed, often as if it were all in (A).[6]
I think the charitable explanation for this discrepancy is that a TAI scenario within categories (A-C) is more urgent and tractable, and therefore we should focus conversation on that scenario, even while acknowledging the possibility that this scenario won’t happen because LLMs will plateau.
The uncharitable explanation is that some people’s professed beliefs are out-of-sync with their true beliefs—and if so, maybe they should sort that out.
Q: If LLMs will plateau, then can we rest assured that TAI is many decades away?
A: I think that I and my fellow LLM-plateau-ists can feel pretty good that TAI won’t happen in 2023. Probably not 2024 either. Once we start getting much further out than that, I think we should be increasingly uncertain.
The field of AI is wildly different today than it was 10 or 20 years ago. By the same token, even without TAI, we should expect that the field of AI will be wildly different 10 or 20 years into the future. I think 10 or 20 years is more than enough time for currently-underdeveloped (or even currently-nonexistent) AI techniques to be invented, developed, extensively iterated, refined, and scaled.
So even if you’re an LLM plateau-ist, I don’t think you get to feel super-confident that TAI won’t happen in the next 10 or 20 years. Maybe, maybe not. Nobody knows. Technological forecasting is very hard.
(By the way: Even if TAI were known to be many decades away, we should still be working frantically right now on AI alignment, for reasons here.)
(If you think TAI is definitely decades-to-centuries away because the human brain is super complicated, I have a specific response to that here & here.)
Q: If LLMs will plateau, how does that impact governance and “the pause”?
A: As above, I think very dangerous TAI will come eventually, and that we’re extremely not ready for it, and that we’re making slow but steady progress right now on getting ready, and so I’d much rather it come later than sooner. (So far this is independent of whether LLMs will plateau.) (More details and responses to common counter-arguments here.)
Relatedly, there has been a recent push for “pausing giant AI experiments” spearheaded by the Future of Life Institute (FLI).
My take is: a “pause” in training unprecedentedly large ML models is probably good if TAI will look like (A-B), maybe good if TAI will look like (C), and probably counterproductive if TAI will be outside (C).
Why? The biggest problem in my mind is algorithmic progress. If we’re outside (C), then the “critical path to TAI” right now is algorithmic progress. Granted, scaling would need to happen at some point, but not yet, and perhaps not much scaling if any—I think there are strong reasons to expect that a GPT-4 level of scale (or even less, or even much much less) is plenty for TAI, given better algorithms.[7]
(Algorithmic progress becomes less and less relevant as we move towards (A), but it is still relevant to some extent even in (A).)
My guess is that scaling and algorithmic progress are currently trading off against each other, for various reasons. So interventions against scaling would cause faster algorithmic progress, which is bad from my perspective.
(Incidentally, at least one non-LLM-plateau-ist also opposes “pause” for reasons related to algorithmic progress.)
The obvious follow-up question is: “OK then how do we intervene to slow down algorithmic progress towards TAI?” The most important thing IMO is to keep TAI-relevant algorithmic insights and tooling out of the public domain (arXiv, GitHub, NeurIPS, etc.). I appeal to AI researchers to not publicly disclose their TAI-relevant ideas, and to their managers to avoid basing salary, hiring, and promotion decisions on open publications.[8] Researchers in almost every other private-sector industry publish far less than do private-sector ML/AI researchers. Consider SpaceX, for example.
I will also continue to spend some of my time trying to produce good pedagogy about AI x-risk, and to engage in patient, good-faith arguments (as opposed to gotchas) when the subject comes up, and to do the research that may lead to more crisp and rigorous arguments for why AI doom is likely (if indeed it’s likely). I encourage others to continue doing all those things too.
To be sure, I don’t think “slow down algorithmic progress towards TAI by trying to win over AI researchers and ask nicely for a change to AI research culture” is an intervention that will buy us much time, but maybe a bit, and it’s obviously a good thing to do regardless, and anyway I don’t have any better ideas.
(Thanks Linda Linsefors & Seth Herd for critical comments on a draft.)
If that’s not obvious, consider (as an existence proof) that it will eventually be possible to run brain-like algorithms on computer chips, as smart and insightful as any human, but thinking 100× faster, and there could be trillions of them with trillions of teleoperated robot bodies and so on. More discussion here.
It’s quite possible that there is more than one viable path to TAI, in which case the question is: which one will happen first? I am implicitly assuming in this post that the (A-B) LLMs are far enough along (compared to other approaches) that either they will plateau soon or they will “win the race”. If you like, you can replace the phrase “LLMs will plateau” with the weaker “LLMs will plateau or at least revert to a much much slower rate of improvement such that other paths to TAI will happen instead.”
Stating my own opinions without justifying them: I put pretty low weight on (A-B), mainly for theoretical reasons. I’m nervous to confidently say that (C) won’t happen, because (C) is a broad category including lots of possibilities that have never occurred to me. But I put pretty low weight on it anyway. (After all, (D) & (E) include even more possibilities that have never occurred to me!) I put negligible weight on (F). I seem to be the only full-time AI x-risk researcher who treats (E) as the most likely possibility. Heck, maybe I’m the only full-time AI x-risk researcher who treats (E) as a possibility at all. (Human brains have “neural nets”, but they’re not “deep”, and they differ from DNNs in various other ways too. Experts disagree about whether any of those differences are important.) But (E)-versus-(D) is not too safety-relevant anyway, in my opinion—my research interest is in safety/alignment for model-based RL AGI, and model-based RL AGI could exist in (E) or (D) or (C), and it doesn’t matter too much which from a safety perspective, AFAICT.
Explanation for the unfamiliar: From my perspective, developing TAI soon is bad—see final section below. I have various idiosyncratic grand ideas about the requirements for powerful AI and limitations of LLMs. Maybe those ideas are wrong and stupid, in which case, it’s just as well that I don’t spread them. Or maybe they’re right, in which case, I also don’t want to spread them, because they could help make TAI come marginally sooner. (Spreading the ideas would also have various benefits, but I think the costs mostly outweigh the benefits here.)
For my part, I find it very hard to imagine that a literal robot body will be necessary to create an AI that poses an x-risk. I guess I’m somewhat open-minded to the possibility that a virtual robot body in a VR environment could be necessary (or at least helpful) during at least part of the training. And I think it’s quite likely that the AI needs to have an “action space” of some sort to reach TAI levels, even if it’s non-body-ish things like “virtually opening a particular text document to a particular page”.
For example, my sense is that some people will explicitly say “scale is all you need”, but many more people will effectively assume that “scale is all you need” when they’re guessing when TAI will arrive, what it will look like, what its alignment properties will be, how much compute it will involve, etc. To elaborate on one aspect of that, there’s been a lot of alignment-of-(A) discourse (simulators, waluigi, shoggoth memes, etc.), and to me it’s far from obvious that this discourse would continue to be centrally relevant in the broader category of (C). For example, it seems to me that there are at least some possible (C)-type systems for which we’ll need to be thinking about their safety using mostly “classic agent alignment discourse” (instrumental convergence, goal misgeneralization, etc.) instead.
For example, the calculations of a human brain entail fewer FLOP/s than a single good GPU (details). The brain might have more memory capacity than one GPU, although my current guess is it doesn’t; anyway, the brain almost definitely has less memory capacity than 1000 GPUs.
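The flavor of this footnote’s comparison can be made concrete with a back-of-envelope calculation. Every number below is an illustrative assumption of mine, not a figure from the post; published estimates of brain compute and memory span several orders of magnitude:

```python
# Back-of-envelope comparison of brain vs. GPU compute and memory.
# All numbers are rough illustrative assumptions, not measurements.

brain_flops = 1e14          # assumed brain compute, FLOP/s (estimates vary widely)
gpu_flops = 3e14            # assumed dense FP16 throughput of one modern datacenter GPU

brain_memory_bytes = 1e13   # assumed brain memory capacity, ~10 TB
gpu_memory_bytes = 8e10     # assumed ~80 GB of HBM on one modern GPU

# Under these assumptions, the brain's compute fits within one GPU...
print(f"GPUs to match brain compute: {brain_flops / gpu_flops:.2f}")   # 0.33
# ...while its memory needs on the order of a hundred GPUs, well under 1000.
print(f"GPUs to match brain memory:  {brain_memory_bytes / gpu_memory_bytes:.0f}")  # 125
```

With these assumed figures, the qualitative conclusions match the footnote: fewer FLOP/s than a single good GPU, possibly more memory than one GPU, but far less memory than 1000 GPUs.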
There are cases where public disclosure has specific benefits (helping safety / alignment) that outweigh its costs (making TAI come sooner). It can be a tricky topic. But I think many AI researchers don’t even see timeline-shortening as a cost at all, but rather a benefit.