I had a potential disagreement with your claim that a pause is probably counterproductive if there’s a paradigm change required to reach AGI: even if the algorithms of the current paradigm aren’t directly a part of the algorithm behind existentially dangerous AGI, advances in these algorithms will massively speed up research and progress towards this goal.
My take is: a “pause” in training unprecedentedly large ML models is probably good if TAI will look like (A-B), maybe good if TAI will look like (C), and probably counterproductive if TAI will be outside (C).
+
The obvious follow-up question is: “OK then how do we intervene to slow down algorithmic progress towards TAI?” The most important thing IMO is to keep TAI-relevant algorithmic insights and tooling out of the public domain (arxiv, github, NeurIPS, etc.).
This seems slightly contradictory to me. Whether or not the current paradigm results in TAI, it is certainly going to make it easier to code, communicate ideas, write, and research faster, which would potentially make all sorts of progress, including algorithmic insights, come sooner. A pause on the current paradigm would thus be neutral or net positive if it means ensuring progress isn't sped up by more powerful models.
I think that's one consideration, but there are a bunch of considerations pointing in both directions. For example:
Pause in scaling up LLMs → less algorithmic progress:
The LLM code-assistants or research-assistants will be worse
Maybe you can only make algorithmic progress via doing lots of GPT-4-sized training runs or bigger and seeing what happens
Maybe pause reduces AI profit which would otherwise be reinvested in R&D
Pause in scaling up LLMs → more algorithmic progress:
Maybe doing lots of GPT-4-sized training runs or bigger is a distraction from algorithmic progress
In pause-world, it’s cheaper to get to the cutting edge, so more diverse researchers & companies are there, and they’re competing more narrowly on algorithmic progress (e.g. the best algorithms will get the highest scores on benchmarks or whatever, as opposed to whatever algorithms got scaled the most getting the highest scores)
Other things:
Pro-pause: It’s “practice for later”, “policy wins beget policy wins”, etc., so it will be easier next time (related)
Anti-pause: People will learn to associate “AI pause” = “overreaction to a big nothing”, so it will be harder next time (related)
Pro-pause: Needless to say, maybe I’m wrong and LLMs won’t plateau!
There are probably other things too. For me, the balance of considerations is that a pause in scaling up LLMs will probably lead to more algorithmic progress. But I don’t have great confidence.
(We might differ in how much of a difference we’re expecting LLM code-assistants and research-assistants to make. I put them in the same category as PyTorch and TensorFlow and IDEs and stackoverflow and other such productivity-enhancers that we’re already living with, as opposed to something wildly more impactful than that.)
For me, the balance of considerations is that a pause in scaling up LLMs will probably lead to more algorithmic progress
I’d consider this to be one of the more convincing reasons to be hesitant about a pause (as opposed to the ‘crying wolf’ argument, which seems to me like a dangerous way to think about coordinating on AI safety?).
I don’t have a good model for how much serious effort is currently going into algorithmic progress, so I can’t say anything confidently there—but I would guess there’s plenty and it’s just not talked about?
It might be a question of which of the following two you think is more likely to result in a dangerous new paradigm sooner (assuming LLMs aren’t the dangerous paradigm):
(1) the current amount of effort put into algorithmic progress, amplified by code assistants, apps, tools, research-assistants, etc.
(2) the counterfactual amount of effort put into algorithmic progress if a pause on scaling happens
I think I’m leaning towards (1) bringing about a dangerous new paradigm sooner, because:
I don’t think the counterfactual amount of effort on algorithmic progress will be that much more significant than the current efforts (pretty uncertain on this, though)
I’m wary of adding faster feedback loops to technological progress/allowing avenues for meta-optimizations to humanity, since these can compound
I’d consider this to be one of the more convincing reasons to be hesitant about a pause (as opposed to the ‘crying wolf’ argument, which seems to me like a dangerous way to think about coordinating on AI safety?).
Can you elaborate on this? I think it’s incredibly stupid that people consider it to be super-blameworthy to overprepare for something that turned out not to be a huge deal—even if the expected value of the preparation was super-positive given what was known at the time. But, stupid as it may be, it does seem to be part of the situation we’re in. (What politician wants an article like this to be about them?) (Another example.) I’m in favor of interventions to try to change that aspect of our situation (e.g. widespread use and normalization of prediction markets??), but in the meantime, it seems to me that we should keep that dynamic in mind (among other considerations). Do you disagree with that in principle? Or think it’s overridden by other considerations? Or something else?
Maybe—I can see it being spun in two ways:
(1) The AI safety/alignment crowd was irrationally terrified of chatbots/current AI, forced everyone to pause, and then, unsurprisingly, didn’t find anything scary.
(2) The AI safety/alignment crowd needed time for their alignment techniques to catch up with the current models before things get dangerous in the future, and they did that.
To point (1): alignment researchers aren’t terrified of GPT-4 taking over the world, wouldn’t agree to this characterization, and are not communicating this to others. I don’t expect this is how things will be interpreted if people are being fair.
I think (2) is the realistic spin, and could go wrong reputationally (like in the examples you showed) if there’s no interesting scientific alignment progress made in the pause-period. I don’t expect there to be a lack of interesting progress, though. There’s plenty of unexplored work in interpretability alone that could provide many low-hanging fruit results. This is something I naturally expect out of a young field with a huge space of unexplored empirical and theoretical questions. If there’s plenty of alignment research output during that time, then I’m not sure the pause will really be seen as a failure.
I’m in favor of interventions to try to change that aspect of our situation
Yeah, agree. I’d say one of the best ways to do this is to make it clear what the purpose of the pause is and to define what counts as the pause being a success (e.g. significant research output).
Also, your pro-pause points seem quite important, in my opinion, and outweigh the ‘reputational risks’ by a lot:
Pro-pause: It’s “practice for later”, “policy wins beget policy wins”, etc., so it will be easier next time
Pro-pause: Needless to say, maybe I’m wrong and LLMs won’t plateau!
I’d honestly find it a bit surprising if the likely reaction to this were to dismiss future coordination on AI safety. “Pausing to catch up on alignment work” doesn’t seem like the kind of thing that leads the world to think “AI can never be existentially dangerous” and makes future coordination harder. If AI keeps being more impressive than the current SOTA, I’m not really sure risk concerns will easily go away.