Sam Altman apparently claims OpenAI doesn’t plan to do recursive self improvement
Nate Silver’s new book On the Edge contains interviews with Sam Altman. Here’s a quote from Chapter ∞ that stuck out to me (bold mine):
Yudkowsky worries that the takeoff will be faster than what humans will need to assess the situation and land the plane. We might eventually get the AIs to behave if given enough chances, he thinks, but early prototypes often fail, and Silicon Valley has an attitude of “move fast and break things.” If the thing that breaks is civilization, we won’t get a second try.
Footnote: This is particularly worrisome if AIs become self-improving, meaning you train an AI on how to make a better AI. Even Altman told me that this possibility is “really scary” and that OpenAI isn’t pursuing it.
I’m pretty confused about why this quote is in the book. OpenAI has never (to my knowledge) made public statements about not using AI to automate AI research, and my impression was that automating AI research is explicitly part of OpenAI’s plan. My best guess is that there was a misunderstanding in the conversation between Silver and Altman.
I looked a bit through OpenAI’s comms to find quotes about automating AI research, but I didn’t find many.
If the model is able to conduct AI research fully autonomously, it could set off an intelligence explosion.
Footnote: By intelligence explosion, we mean a cycle in which the AI system improves itself, which makes the system more capable of more improvements, creating a runaway process of self-improvement. A concentrated burst of capability gains could outstrip our ability to anticipate and react to them.
AI that can accelerate science is a special case worth thinking about, and perhaps more impactful than everything else. It’s possible that AGI capable enough to accelerate its own progress could cause major changes to happen surprisingly quickly (and even if the transition starts slowly, we expect it to happen pretty quickly in the final stages). We think a slower takeoff is easier to make safe, and coordination among AGI efforts to slow down at critical junctures will likely be important (even in a world where we don’t need to do this to solve technical alignment problems, slowing down may be important to give society enough time to adapt).
There are some quotes from Sam Altman’s personal blog posts from 2015 (bold mine):
It’s very hard to know how close we are to machine intelligence surpassing human intelligence. Progression of machine intelligence is a double exponential function; human-written programs and computing power are getting better at an exponential rate, and self-learning/self-improving software will improve itself at an exponential rate. Development progress may look relatively slow and then all of a sudden go vertical—things could get out of control very quickly (it also may be more gradual and we may barely perceive it happening).
As mentioned earlier, it is probably still somewhat far away, especially in its ability to build killer robots with no help at all from humans. But recursive self-improvement is a powerful force, and so it’s difficult to have strong opinions about machine intelligence being ten or one hundred years away.
Given how disastrous a bug could be, [regulation should] require development safeguards to reduce the risk of the accident case. For example, beyond a certain checkpoint, we could require development happen only on airgapped computers, require that self-improving software require human intervention to move forward on each iteration, require that certain parts of the software be subject to third-party code reviews, etc. I’m not very optimistic than any of this will work for anything except accidental errors—humans will always be the weak link in the strategy (see the AI-in-a-box thought experiments). But it at least feels worth trying.
OpenAI intends to use Strawberry to perform research. . . .
Among the capabilities OpenAI is aiming Strawberry at is performing long-horizon tasks (LHT), the document says, referring to complex tasks that require a model to plan ahead and perform a series of actions over an extended period of time, the first source explained.
To do so, OpenAI is creating, training and evaluating the models on what the company calls a “deep-research” dataset, according to the OpenAI internal documentation. . . .
OpenAI specifically wants its models to use these capabilities to conduct research by browsing the web autonomously with the assistance of a “CUA,” or a computer-using agent, that can take actions based on its findings, according to the document and one of the sources. OpenAI also plans to test its capabilities on doing the work of software and machine learning engineers.
“require that self-improving software require human intervention to move forward on each iteration”
is the unspoken distinction occurring here, how constant the feedback loop is for self-improvement.
So, people talk about recursive self-improvement, but mean two separate things, one is recursive self-improving models that require no human intervention to move forward on each iteration (perhaps there no longer is an iterative release process, the model is dynamic and constantly improving), and the other is somewhat the current step paradigm where we get a GPT-N+1 model that is 100x the effective compute of GPT-N.
So Sam says, no way do we want a constant curve of improvement, we want a step function. In both cases models contribute to AI research, in one case it contributes to the next gen, in the other case it improves itself.
Sam Altman apparently claims OpenAI doesn’t plan to do recursive self improvement
Nate Silver’s new book On the Edge contains interviews with Sam Altman. Here’s a quote from Chapter ∞ that stuck out to me (bold mine):
I’m pretty confused about why this quote is in the book. OpenAI has never (to my knowledge) made public statements about not using AI to automate AI research, and my impression was that automating AI research is explicitly part of OpenAI’s plan. My best guess is that there was a misunderstanding in the conversation between Silver and Altman.
I looked a bit through OpenAI’s comms to find quotes about automating AI research, but I didn’t find many.
There’s this quote from page 11 of the Preparedness Framework:
In Planning for AGI and beyond, they say this:
There are some quotes from Sam Altman’s personal blog posts from 2015 (bold mine):
Another 2015 blog post (bold mine):
I wouldn’t trust an Altman quote in a book tbh. In fact, I think it’s reasonable to not trust what Altman says in general.
I agree.
Another source:
I have a guess that this:
“require that self-improving software require human intervention to move forward on each iteration”
is the unspoken distinction occurring here, how constant the feedback loop is for self-improvement.
So, people talk about recursive self-improvement, but mean two separate things, one is recursive self-improving models that require no human intervention to move forward on each iteration (perhaps there no longer is an iterative release process, the model is dynamic and constantly improving), and the other is somewhat the current step paradigm where we get a GPT-N+1 model that is 100x the effective compute of GPT-N.
So Sam says, no way do we want a constant curve of improvement, we want a step function. In both cases models contribute to AI research, in one case it contributes to the next gen, in the other case it improves itself.