RE discussion of gradual-ness, continuity, early practice, etc.:
FWIW, here’s how I currently envision AGI developing, which seems to be in a similar ballpark as Eliezer’s picture, or at least closer than most people I think? (Mostly presented without argument.)
There’s a possible R&D path that leads to a model-based RL AGI. It would be very agent-y, and have some resemblance to human brain algorithms (I claim), and be able to “figure things out” and “mull things over” and have ideas and execute on them, and understand the world and itself, etc., akin to how humans do all those things.
Large language models (LLMs) trained mainly by self-supervised learning (SSL), as built today, are not that path (although they might include some ingredients which would overlap with that path). In my view, those SSL systems are almost definitely safer, and almost definitely much less capable, than the agent-y model-based RL path. For example, I don’t think that the current SSL-LLM path is pointing towards “The last invention that man need ever make”. I won’t defend that claim here.
But meanwhile, like it or not, lots of other people are as we speak racing down the road towards the more brain-like, more agent-y, model-based RL AGI. We should presume that they’ll eventually succeed. We could try to stop them, but doing so seems impossible right now. In the future, the SSL-LLM path will be producing more impressive AI models than today, but I don’t expect that fact to change the world very much in relevant ways, such that we’ll still be in roughly the same situation of not having any way (AFAICT) to stop researchers from inventing agent-y model-based RL AGI. So our only choice is to figure out how to navigate the world in which people eventually build agent-y model-based RL AGI.
(Note—Since “SSL-LLM safety” doesn’t coincide with “agent-y model-based RL safety”, a natural consequence is Eliezer [uncharitably] describing some people’s work as ‘not engaging with the core difficulties’ or whatever.)
Anyway, model-based RL algorithms can already do neat things like play computer games, but they can’t yet conquer humanity, and I think part of that is related to model-based RL algorithmic innovations that have yet to happen. So during some period of time, those future algorithmic innovations will happen, and meanwhile people will be scaling up and optimizing and hardware-accelerating the algorithms and architectures.
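For concreteness, here is a minimal sketch of what “model-based RL” means here: the agent learns a predictive model of its environment and chooses actions by imagining rollouts under that model, rather than only learning a policy or value function directly from reward. Everything below is illustrative Python with hypothetical class and function names, not any particular library, and the environment interface is assumed to be a simple reset()/step() pair.

```python
# Minimal sketch of a model-based RL loop (illustrative only; all names are
# hypothetical placeholders, not from any particular library).
import random

class WorldModel:
    """Learned predictor: (state, action) -> (predicted next state, predicted reward)."""
    def predict(self, state, action):
        # In a real system this would be a learned neural network; here it's a stub.
        return state, 0.0

    def update(self, state, action, next_state, reward):
        pass  # fit the model to the observed transition

def plan(model, state, actions, horizon=5, n_rollouts=20):
    """Crude random-shooting planner: pick the first action whose imagined
    rollout under the learned model has the highest predicted return."""
    best_action, best_return = None, float("-inf")
    for _ in range(n_rollouts):
        first_action = random.choice(actions)
        s, a, total = state, first_action, 0.0
        for _ in range(horizon):
            s, r = model.predict(s, a)   # "imagine" what happens next
            total += r
            a = random.choice(actions)
        if total > best_return:
            best_action, best_return = first_action, total
    return best_action

def run_episode(env, model, actions):
    # env is assumed to expose reset() and step(action) -> (next_state, reward, done)
    state, done = env.reset(), False
    while not done:
        action = plan(model, state, actions)             # decide by imagining futures
        next_state, reward, done = env.step(action)
        model.update(state, action, next_state, reward)  # improve the world model
        state = next_state
```

The connection to the comment above: the ability to “mull things over” corresponds to the plan() step, where the agent queries its own learned model of the world before acting.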
(Based on human brain compute requirements, I think training from scratch to at least human-level intelligence will probably eventually be possible with relatively modest amounts of chips and money, see here.)
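For a rough sense of what “relatively modest amounts of chips and money” could mean quantitatively, here is a hedged back-of-the-envelope calculation. The specific numbers (brain ≈ 1e15 FLOP/s, ≈ 3e14 FLOP/s sustained per accelerator, a 30-year “simulated childhood” of training) are illustrative assumptions plugged in for the arithmetic, not figures taken from the linked post:

```python
# Hedged back-of-the-envelope: how much compute might "training a brain-like
# model from scratch" take?  All constants below are illustrative assumptions.
BRAIN_FLOP_PER_SEC = 1e15   # assumed effective brain compute (highly uncertain)
GPU_FLOP_PER_SEC   = 3e14   # assumed sustained throughput of one modern accelerator
TRAINING_YEARS     = 30     # assumed "simulated childhood" needed to train from scratch
SECONDS_PER_YEAR   = 3.15e7

total_training_flop = BRAIN_FLOP_PER_SEC * TRAINING_YEARS * SECONDS_PER_YEAR
accelerator_years   = total_training_flop / GPU_FLOP_PER_SEC / SECONDS_PER_YEAR

print(f"Total training compute: {total_training_flop:.2e} FLOP")
print(f"≈ {accelerator_years:.0f} accelerator-years "
      f"(e.g. ~{accelerator_years:.0f} accelerators running for about a year)")
```

Under those (very debatable) assumptions, the total works out to on the order of a hundred accelerator-years, which is roughly the scale of a university GPU cluster or a modestly funded lab rather than a frontier-scale training run.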
A key question is: how long is this period between “This specific model-based RL technological path is producing the AIs that are receiving a very large share of overall attention and investment by the ML research community” [reworded for clarification, see later comment] and “This specific model-based RL technological path can produce an out-of-control AGI that could destroy the world”?
Hard to say, but “a couple years” seems entirely plausible to me, and even “zero years (because, until the leading team worked out the kinks, their results weren’t great compared to other very different approaches, and few people were paying attention)” seems plausible. Whereas even “10 years” seems implausibly high to me, I think.
So I find all the arguments in this post related to slow-takeoff, gradual-ness, continuity, etc. to be not so reassuring.
My expectation is that people will turn SSL models into agentic reasoners. I think this will happen through refinements to “chain of thought”-style reasoning approaches. See here. Such approaches absolutely do let LLMs “mull things over” to a limited degree, even with current very crude methods to do chain of thought with current LLMs. I also think future RL advancements will be more easily used to get better chain of thought reasoners, rather than accelerating a new approach to the SOTA.
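To make the “mull things over to a limited degree” claim concrete, here is a minimal sketch of chain-of-thought prompting with self-consistency voting over several sampled reasoning traces, which is one example of the current very crude methods. The generate function is a hypothetical placeholder for whatever LLM text-completion call is available, not a specific API:

```python
# Minimal sketch of chain-of-thought prompting with self-consistency voting.
# `generate` is a hypothetical stand-in for any LLM text-completion call.
from collections import Counter

def generate(prompt: str, temperature: float = 0.7) -> str:
    """Placeholder: call your LLM of choice and return its completion."""
    raise NotImplementedError

def chain_of_thought_answer(question: str, n_samples: int = 5) -> str:
    prompt = (
        f"Question: {question}\n"
        "Let's think step by step, then give the final answer "
        "on a line starting with 'Answer:'.\n"
    )
    final_answers = []
    for _ in range(n_samples):
        completion = generate(prompt)           # the model "mulls it over" in text
        for line in completion.splitlines():
            if line.startswith("Answer:"):
                final_answers.append(line[len("Answer:"):].strip())
                break
    # Self-consistency: return the most common final answer across the samples.
    return Counter(final_answers).most_common(1)[0][0] if final_answers else ""
```

The intermediate reasoning here is ordinary sampled text, which is the sense in which even crude methods already let the model “mull things over”.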
A key question is: how long is this period between “This specific model-based RL technological path is producing the AIs that everyone is using and everyone is talking about” and “This specific model-based RL technological path can produce an out-of-control AGI that could destroy the world”?
Hard to say, but “a couple years” seems entirely plausible to me, and even “zero years (because, until the leading team worked out the kinks, their results weren’t great compared to other very different approaches, and few people were paying attention)” seems plausible. Whereas even “10 years” seems implausibly high to me, I think.
I don’t think Paul would disagree with you about “a couple years” being plausible, based on Agreements #8 from his post (bold mine):
8. The broader intellectual world seems to wildly overestimate how long it will take AI systems to go from “large impact on the world” to “unrecognizably transformed world.” This is more likely to be years than decades, and there’s a real chance that it’s months. This makes alignment harder and doesn’t seem like something we are collectively prepared for.
At first I read Paul’s post as having very gradualist assumptions all around. But he clarified to me in this comment and the back-and-forth we had in replies that he expects a fairly long initial period before AI has a large impact on the world (similar to your “This specific model-based RL technological path is producing the AIs that everyone is using and everyone is talking about”), which he pegs at ~40% probability by 2040. After that point, he predicts a pretty speedy progression to “unrecognizably transformed world”, which I think includes the possibility of catastrophe.
I don’t think Paul is saying the same thing as me. My wording was bad, sorry.
When I said “the AIs that everyone is using and everyone is talking about”, I should have said “the AIs that are receiving a very large share of overall attention and investment by the ML research community”. (I just went back and edited the original.)
As of today (2022), large language models are “the AIs that are receiving a very large share of overall attention and investment by the ML research community”. But they are not having a “large impact on the world” by Paul’s definition. For example, the current contribution of large language models to global GDP is ≈0%.
The question of whether an AI approach is “receiving a very large share of overall attention and investment by the ML research community” is very important because:
if yes, we expect low-hanging fruit to be rapidly picked, after which we expect smaller incremental advances in perpetuity, and we expect state-of-the-art models to be using roughly the maximum amount of compute that is at all possible to use.
if no (i.e. if an AI approach is comparatively a bit of a backwater, like say model-based RL or probabilistic programming today), we should be less surprised by (for example) a flurry of very impactful advances within a short period of time, while most people aren’t paying attention, and then bam, we have a recipe for a superhuman AGI that can be trained on a university GPU cluster.
Ok I see what you mean, thanks for clarifying.
I suspect that LLMs are going to be put to more and more practical use in the near future. I just did a search on “AI and legal briefs” and came up with ads and articles about “prediction based” systems to help lawyers prepare legal briefs. I assume “prediction based” means LLM.