Currently, OpenAI has a clear lead over its competitors.[1] This is arguably the safest arrangement as far as race dynamics go, because it gives OpenAI some breathing room in case they ever need to slow down later on for safety reasons, and also because their competitors don’t necessarily have a strong reason to think they can easily sprint to catch up.
So far as I can tell, this petition would just be asking OpenAI to burn six months of that lead and let other players catch up. That might create a very dangerous race dynamic, where now you have multiple players neck-and-neck, each with a credible claim to have a chance to get into the lead.
(And I’ll add: while OpenAI has certainly made decisions I disagree with, at least they actively acknowledge existential safety concerns and have a safety plan and research agenda. I’d much rather they be in the lead than Meta, Baidu, the Chinese government, etc., all of whom to my knowledge have almost no active safety research and in some cases are actively dismissive of the need for such.)
One might have considered Google/DeepMind to be OpenAI’s peer, but after the release of Bard — which is substantially behind GPT-3.5 capabilities, never mind GPT-4 — I think this is a hard view to hold.
DeepMind might be more cautious about what it releases, and/or developing systems whose power is less legible than GPT. I have no real evidence here, just vague intuitions.
I agree that those are possibilities.
On the other hand, why did news reports[1] (e.g. from the New York Times) suggest that Google was caught flat-footed by ChatGPT and re-oriented to rush Bard to market?
My sense is that Google/DeepMind’s lethargy in the area of language models is due to a combination of a few factors:
They’ve diversified their bets to include things like protein folding, fusion plasma control, etc. which are more application-driven and not on an AGI path.
They’ve focused more on fundamental research and less on productizing and scaling.
Their language model experts might have a somewhat high annual attrition rate.
I just looked up the authors on Google Brain’s Attention is All You Need, and all but one have left Google after 5.25 years, many for startups, and one for OpenAI. That works out to an annual attrition of 33%.
For DeepMind’s Chinchilla paper, 6 of 22 researchers have been lost in 1 year: 4 to OpenAI and 2 to startups. That’s 27% annual attrition.
By contrast, 16 or 17 of the 30 authors on the GPT-3 paper seem to still be at OpenAI, 2.75 years later, which works out to 20% annual attrition. Notably, of those who have left, not a one has left for Google or DeepMind, though interestingly, 8 have left for Anthropic. (Admittedly, this somewhat reflects the relative newness and growth rates of Google/DeepMind, OpenAI, and Anthropic, since a priori we expect more migration from slow-growing orgs to fast-growing orgs than vice versa.)
It’s broadly reported that Google as an organization struggles with stifling bureaucracy and a lack of urgency. (This was also my observation working there more than ten years ago, and I expect it’s gotten worse since.)
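For anyone who wants to check the attrition arithmetic above, here is a minimal sketch. It assumes the percentages are compound annual rates, i.e. the constant yearly rate r such that total × (1 − r)^years = remaining; the comment doesn't state the formula, so that is my reading. The function name is made up for illustration, and the count of 8 authors on Attention is All You Need is my own addition (the comment only says "all but one" have left).

```python
def annual_attrition(remaining: int, total: int, years: float) -> float:
    """Constant yearly attrition rate r with total * (1 - r) ** years == remaining.

    Assumption: attrition compounds annually at a fixed rate.
    """
    return 1.0 - (remaining / total) ** (1.0 / years)


# (label, authors remaining, authors total, years since publication)
# Figures are taken from the comment above; the total of 8 authors for the
# Attention paper is an assumption not stated in the comment.
cohorts = [
    ("Attention is All You Need (Google Brain)", 1, 8, 5.25),
    ("Chinchilla (DeepMind)", 22 - 6, 22, 1.0),
    ("GPT-3 (OpenAI), if 16 remain", 16, 30, 2.75),
    ("GPT-3 (OpenAI), if 17 remain", 17, 30, 2.75),
]

for label, remaining, total, years in cohorts:
    rate = annual_attrition(remaining, total, years)
    print(f"{label}: {rate:.0%} per year")
```

Under that assumption the numbers come out at roughly 33%, 27%, and 19–20%, which matches the figures quoted above. With cohorts this small, a single departure moves the rate by a point or more, so the comparison is suggestive rather than precise.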
OpenAI seems to also have been caught flat-footed by ChatGPT, or more specifically by the success it got. It seems like the success came largely from the chat interface that made it intuitive for people on the street to use—and none of the LLM techies at any company realized what a difference that would make.
Yes, although the chat interface was necessary but insufficient. They also needed a capable language model behind it, which OpenAI already had, and Google still lacks months later.
I think talking about Google/DeepMind as a unitary entity is a mistake. I’m gonna guess that Peter agrees, and that’s why he specified DeepMind. Google’s publications identify at least two internal language models superior to LaMDA, so their release of Bard based on LaMDA doesn’t tell us much. They are certainly behind in commercializing chatbots, but that is a weak claim. How DeepMind compares to OpenAI is difficult to judge. Four people going to OpenAI is damning, though.
A somewhat reliable source has told me that they don’t have the compute infrastructure to support making a more advanced model available to users.
That might also reflect limited engineering efforts to optimize state-of-the-art models for real world usage (think of the performance gains from GPT-3.5 Turbo) as opposed to hitting benchmarks for a paper to be published.
That might be true if nothing is actually done in the 6+ months to improve AI safety and governance. But the letter proposes:
AI labs and independent experts should use this pause to jointly develop and implement a set of shared safety protocols for advanced AI design and development that are rigorously audited and overseen by independent outside experts. These protocols should ensure that systems adhering to them are safe beyond a reasonable doubt.[4] This does not mean a pause on AI development in general, merely a stepping back from the dangerous race to ever-larger unpredictable black-box models with emergent capabilities.
AI research and development should be refocused on making today’s powerful, state-of-the-art systems more accurate, safe, interpretable, transparent, robust, aligned, trustworthy, and loyal.
In parallel, AI developers must work with policymakers to dramatically accelerate development of robust AI governance systems. These should at a minimum include: new and capable regulatory authorities dedicated to AI; oversight and tracking of highly capable AI systems and large pools of computational capability; provenance and watermarking systems to help distinguish real from synthetic and to track model leaks; a robust auditing and certification ecosystem; liability for AI-caused harm; robust public funding for technical AI safety research; and well-resourced institutions for coping with the dramatic economic and political disruptions (especially to democracy) that AI will cause.
If a major fraction of all resources at the top 5–10 labs were reallocated to “us[ing] this pause to jointly develop and implement a set of shared safety protocols”, that seems like it would be a good thing to me.
However, the letter offers no guidance as to what fraction of resources to dedicate to this joint safety work. Thus, we can expect that DeepMind and others might each devote a couple teams to that effort, but probably not substantially halt progress at their capabilities frontier.
The only player who is effectively being asked to halt progress at its capabilities frontier is OpenAI, and that seems dangerous to me for the reasons I stated above.
It’s not clear to me that OpenAI has a clear lead over Anthropic in terms of capabilities.
I believe Anthropic is committed to not pushing at the state-of-the-art, so they may not be the most relevant player in discussions of race dynamics.