GPT-2 is very far from taking over the world (and was indeed <<10 people). GPT-3 was bigger (though still probably <10 people, depending on how you amortize infrastructure), and remains far from taking over the world. Modern projects are >10 people, and still not yet taking over the world. It looks like it's already not super plausible for 10 people to catch up, and it's rapidly getting less plausible. The prediction isn't yet settled, but so far it isn't resolving in Eliezer's favor either, and it's clear which way the wind blows.
These projects are well-capitalized, with billions of dollars in funding now and valuations rapidly rising (though maybe a dip right now with tech stocks overall down ~25%). These projects need to negotiate absolutely massive compute contracts, and lots of the profit looks likely to flow to compute companies. Most of the work is going into the engineering aspects of these projects. There are many labs with roughly-equally-good approaches, and no one has been able to pull much ahead of the basic formula—most variation is explained by how big a bet different firms are willing to make.
Eliezer is not talking about 10 people making a dominant AI because the rest of the world is busy slowing down out of concern for AI risk; he is talking about the opposite situation: 10 people making a dominant AI that is also safer, while everyone else barrels ahead, which is possible because those 10 people are building AI in a better way. In addition to 10 people, the view "you can find a better way to build AI that's way more efficient than other people" is also starting to look increasingly unlikely as performance continues to be dominated by scale and engineering rather than clever ideas.
Everything is vague enough that I might be totally misunderstanding the view, and there is a lot of slack in how you compare it to reality. But for me the most basic point is that this is not a source of words about the future that I personally should be listening to; if there is a way to turn these words into an accurate visualization of the future, I lack the machinery to do so.
(The world I imagined when I read Robin's words also looks different from the world of today in a lot of important ways. But it's just not such a slam dunk comparing them; on this particular axis it sure looks more like Robin's world to me. I do wish that people had stated some predictions so that we could tell precisely rather than playing this game.)
The historical track record of software projects is that it's relatively common for a small team of ~10 people to outperform 1000+ person teams. Indeed, I feel like this is roughly what happened with DeepMind and OpenAI. I feel like in 2016 you could have said that current AGI projects already have 500+ employees and are likely to grow even bigger, so it's unlikely that a small 10-person team could catch up; and then suddenly the most cutting-edge project was launched by a 10-person team. (Yes, that 10-person team needed a few million dollars, but a few million dollars are not that hard to come by in the tech sector.)
My current guess is that we will continue to see small ~10-person teams push the cutting edge forward in AI, just as we've seen in most other domains of software.
In addition to 10 people, the view “you can find a better way to build AI that’s way more efficient than other people” is also starting to look increasingly unlikely as performance continues to be dominated by scale and engineering rather than clever ideas.
I do agree with this in terms of what has been happening in the last few years, though I expect it to break down as we see more things in the "leveraging AI to improve AI development progress" and "recursive self-improvement" categories, which seem to be just coming over the horizon. It already seems pretty plausible to me that a team with exclusive access to a better version of Codex would have some chance of outperforming other software development teams by 3-5x, which would then feed into further progress on the performance of the relevant systems.
I do think this is substantially less sharp than what Eliezer was talking about at the time, but I personally find the hypothesis that "a small team of people using AI tools to develop AI systems can vastly outperform large teams that are less smart about it" pretty plausible, and probably more likely than not to be what eventually happens.
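As a toy illustration of the compounding I have in mind (a minimal sketch with made-up numbers, not a real model of R&D): if each generation of tooling multiplies the productivity with which the next generation gets built, a team with even a modest multiplier pulls away from a fixed-productivity baseline fairly quickly.

```python
# Toy compounding model: purely hypothetical numbers, just to illustrate
# how a per-generation productivity multiplier feeds back into total output.

def cumulative_output(generations: int, multiplier: float) -> float:
    """Sum of output over `generations`, where each generation's
    productivity is the previous generation's times `multiplier`."""
    productivity, total = 1.0, 0.0
    for _ in range(generations):
        total += productivity
        productivity *= multiplier
    return total

if __name__ == "__main__":
    for m in (1.0, 1.5, 3.0):  # 1.0 = no feedback; 3.0 = the optimistic case above
        print(f"multiplier {m}: total output after 5 generations = "
              f"{cumulative_output(5, m):.1f}")
```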
I think “team uses Codex to be 3x more productive” is more like the kind of thing Robin is talking about than the kind of thing Eliezer is talking about (e.g. see the discussion of UberTool, or just read the foom debate overall). And if you replace 3x with a more realistic number, and consider the fact that right now everyone is definitely selling that as a product rather than exclusively using it internally as a tool, then it’s even more like Robin’s story.
Everyone involved believes in the possibility of tech startups, and I'm not even sure they have different views about the expected returns to startup founders. The 10 people who start an AI startup can make a lot of money, and the company will typically grow to a large scale (with significant dilution, though still quite a lot of influence for the founders) before it makes its most impressive AI systems.
I think this kind of discussion is pretty unproductive, and it mostly just reinforces the OP's point that people should actually predict something about the world if we want this kind of discussion to be remotely useful for deciding how to change beliefs as new evidence comes in (at least beliefs about which people / models / reasoning strategies work well). If you want to state any predictions about the next 5 years, I'm happy to disagree with them.
The kinds of things I expect are that (i) big models will still be where it's at, (ii) compute budgets and team sizes continue to grow, (iii) improvements from cleverness continue to shrink, (iv) influence held by individual researchers grows in absolute terms but continues to shrink in relative terms, (v) AI tools become 2x more useful over more like a year than a week, (vi) AI contributions to AI R&D look similar to human contributions in various ways. Happy to put numbers on those if you want to disagree on any. Places where I agree with the foom story: I expect AI to be applied differentially to AI R&D, I expect the productivity of individual AI systems to scale relatively rapidly with compute and R&D investment, I expect overall progress to be qualitatively large, and so on.
Yeah, I think this is fair. I’ll see whether I can come up with some good operationalizations.
Possible counterevidence (10 months later)? The GPT-4 contributors list includes almost 300 names.[1]
Methodology: I copied text from the contributors page (down to just before it says “We also acknowledge and thank every OpenAI team member”), used some quick Emacs keyboard macros to munge out the section headers and non-name text (like “[topic] lead”), deduplicated and counted in Python (and subtracted one for a munging error I spotted after the fact), and got 290. Also, you might not count some sections of contributors (e.g., product management, legal) as relevant to your claim.
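For concreteness, something like the following reproduces the dedupe-and-count step (a minimal sketch: it assumes the Emacs pass has already left one name per line, and the filename is hypothetical, not the script actually used):

```python
# Minimal sketch of the dedupe-and-count step (assumes contributors.txt
# already holds one name per line after the Emacs munging; the filename
# is a placeholder).
with open("contributors.txt", encoding="utf-8") as f:
    names = {line.strip() for line in f if line.strip()}
print(len(names))
```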
Yep, that is definitely counterevidence! Though my model did predict that we would also continue seeing huge teams make contributions; of course, each marginal major contribution is still evidence.
I have more broadly updated against this hypothesis over the past year or so, though I still think there will be lots of small groups of people quite close to the cutting edge (like less than 12 months behind).
Currently, the multiplier from things like better coding tools and setting up development to be AI-guided has just barely entered the stage where it feels plausible that a well-set-up team could completely destroy large incumbents. We'll see how it develops in the next year or so.