Several people have pointed out that this post seems to take a different stance on race dynamics than was expressed previously.
I think it clearly does. From my perspective, Anthropic’s post is misleading either way—either Claude 3 doesn’t outperform its peers, in which case claiming otherwise is misleading, or they are in fact pushing the frontier, in which case they’ve misled people by suggesting that they would not do this.
Also, “We do not believe that model intelligence is anywhere near its limits, and we plan to release frequent updates to the Claude 3 model family over the next few months” doesn’t inspire much confidence that they’re not trying to surpass other models in the near future.
In any case, I don’t see much reason to think that Anthropic is not aiming to push the frontier. For one, to the best of my knowledge they’ve never even publicly stated they wouldn’t; to the extent that people believe it anyway, it is, as best I can tell, mostly just through word of mouth and some vague statements from Dario. Second, it’s hard for me to imagine that they’re pitching investors on a plan that explicitly aims to make an inferior product relative to their competitors. Indeed, their leaked pitch deck suggests otherwise: “We believe that companies that train the best 2025⁄26 models will be too far ahead for anyone to catch up in subsequent cycles.” I think the most straightforward interpretation of this sentence is that Anthropic is racing to build AGI.
And if they are indeed pushing the frontier, this seems like a negative update about them holding to other commitments about safety. Because while it’s true that Anthropic never, to the best of my knowledge, explicitly stated that they wouldn’t do so, they nevertheless appeared to me to strongly imply it. E.g., in his podcast with Dwarkesh, Dario says:
I think we’ve been relatively responsible in the sense that we didn’t cause the big acceleration that happened late last year and at the beginning of this year. We weren’t the ones who did that. And honestly, if you look at the reaction of Google, that might be ten times more important than anything else. And then once it had happened, once the ecosystem had changed, then we did a lot of things to stay on the frontier.
And Dario on an FLI podcast:

I think we shouldn’t be racing ahead or trying to build models that are way bigger than other orgs are building them. And we shouldn’t, I think, be trying to ramp up excitement or hype about giant models or the latest advances. But we should build the things that we need to do the safety work and we should try to do the safety work as well as we can on top of models that are reasonably close to state of the art.
None of this is Dario saying that Anthropic won’t try to push the frontier, but it certainly heavily suggests that they are aiming to remain at least slightly behind it. And indeed, my impression is that many people expected this from Anthropic, including people who work there, which seems like evidence that this was the implied message.
If Anthropic is in fact attempting to push the frontier, then I think this is pretty bad. They shouldn’t be this vague and misleading about something this important, especially in a way that caused many people to socially support them (and perhaps make decisions to work there). I perhaps cynically think this vagueness was intentional—it seems implausible to me that Anthropic did not know that people believed this, yet they never tried to correct it, which I would guess benefited them: safety-conscious engineers are more likely to work somewhere that they believe isn’t racing to build AGI. Hopefully I’m wrong about at least some of this.
In any case, whether or not Claude 3 already surpasses the frontier, soon will, or doesn’t, I request that Anthropic explicitly clarify whether their intention is to push the frontier.
in which case they’ve misled people by suggesting that they would not do this.
Neither of your examples seems super misleading to me. I feel like there was some atmosphere of “Anthropic intends to stay behind the frontier” when the actual statements were closer to “stay on the frontier”.
Also worth noting that Claude 3 does not substantially advance the LLM capabilities frontier! Aside from GPQA, it doesn’t do that much better on benchmarks than GPT-4 (and in fact does worse than gpt-4-1106-preview). Releasing models that are comparable to models OpenAI released a year ago seems compatible with “staying behind the frontier”, given OpenAI has continued its scale up and will no doubt soon release even more capable models.
That being said, I agree that Anthropic did benefit in the EA community by having this impression. So compared to the impression many EAs got from Anthropic, this is indeed a different stance.
In any case, whether or not Claude 3 already surpasses the frontier, soon will, or doesn’t, I request that Anthropic explicitly clarify whether their intention is to push the frontier.
As Evan says, I think they clarified their intentions in their RSP: https://www.anthropic.com/news/anthropics-responsible-scaling-policy

The main (only?) limit on scaling is their ability to implement containment/safety measures for ever more dangerous models. E.g.:
That is, they won’t go faster than they can scale up safety procedures, but they’re otherwise fine pushing the frontier.
It’s worth noting that their ASL-3 commitments seem pretty likely to trigger in the next few years, and probably will be substantially difficult to meet:
If one of the effects of instituting a responsible scaling policy was that Anthropic moved from the stance of not meaningfully pushing the frontier to “it’s okay to push the frontier so long as we deem it safe,” this seems like a pretty important shift that was not well communicated. I, for one, did not interpret Anthropic’s RSP as a statement that they were now okay with advancing the state of the art, nor did many others; I think that’s because the RSP did not make it clear that they were updating this position. Like, with hindsight I can see how the language in the RSP is consistent with pushing the frontier. But I think the language is also consistent with not pushing it. E.g., when I was operating under the assumption that Anthropic had committed to this, I interpreted the RSP as saying “we’re aiming to scale responsibly to the extent that we scale at all, which will remain at or behind the frontier.”
Attempting to be forthright about this would, imo, look like a clear explanation of Anthropic’s previous stance relative to the new one they were adopting, and their reasons for doing so. To the extent that they didn’t feel the need to do this, I worry that it’s because their previous stance was more of a vibe, and therefore non-binding. But if they were using that vibe to gain resources (funding, talent, etc.), then this seems quite bad to me. It shouldn’t both be the case that they benefit from ambiguity but then aren’t held to any of the consequences of breaking it. And indeed, this makes me pretty wary of other trust/deferral based support that people currently give to Anthropic. I think that if the RSP in fact indicates a departure from their previous stance of not meaningfully pushing the frontier, then this is a negative update about Anthropic holding to the spirit of their commitments.
As one data point: before I joined Anthropic, when I was trying to understand Anthropic’s strategy, I never came away with the impression that Anthropic wouldn’t advance the state of the art. It was quite clear to me that Anthropic’s strategy at the time was more amorphous than that, more like “think carefully about when to do releases and try to advance capabilities for the purpose of doing safety” rather than “never advance the state of the art”. I should also note that now the strategy is actually less amorphous, since it’s now pretty explicitly RSP-focused, more like “we will write RSP commitments that ensure we don’t contribute to catastrophic risk and then scale and deploy only within the confines of the RSP”.
It seems Dario left Dustin Moskovitz with a different impression—that Anthropic had a policy/commitment to not meaningfully advance the frontier:

Well, if Dustin sees no problem in talking about it, and it’s become a major policy concern, then I guess I should disclose that I spent a while talking with Dario back in late October 2022 (i.e., pre-RSP in Sept 2023), and we discussed Anthropic’s scaling policy at some length, and I too came away with the same impression everyone else seems to have: that Anthropic’s AI-arms-race policy was to invest heavily in scaling, creating models at or pushing the frontier to do safety research on, but that they would only release access to second-best models & would not ratchet capabilities up, and it would wait for someone else to do so before catching up. So it would not contribute to races but not fall behind and become irrelevant/noncompetitive.
And Anthropic’s release of Claude-1 and Claude-2 always seemed to match that policy—even if Claude-2 had a larger context window for a long time than any other decent available model, Claude-2 was still substantially weaker than ChatGPT-4. (Recall that the casus belli for Sam Altman trying to fire Helen Toner from the OA board was a passing reference in a co-authored paper to Anthropic not pushing the frontier like OA did.)
What I’m concluding from the discussion so far is that I should have read the Anthropic RSP more carefully than I did.
I hear you sometimes share dual-use (or plain capabilities?) ideas with Anthropic. If that’s true, does this change your policy?

Anthropic is in little need of ideas from me, but yeah, I’ll probably pause such things for a while. I’m not saying the RSP is bad, but I’d like to see how things work out.
They indeed did not advance the frontier with this launch (at least not meaningfully, possibly not at all). But “meaningfully advance the frontier” is quite different from both “stay on the frontier” and “slightly push the envelope while creating marketing hype”, which is what I think is going on here?
Yeah, seems plausible; but either way it seems worth noting that Dario left Dustin, Evan and Anthropic’s investors with quite different impressions here.
I interpreted you, in your previous comment, as claiming that Anthropic’s RSP is explicit in its compatibility with meaningfully pushing the frontier. Dustin is under the impression that Anthropic verbally committed otherwise. Whether or not Claude 3 pushed the frontier seems somewhat orthogonal to this question—did Anthropic commit and/or heavily imply that they weren’t going to push the frontier, and if so, does the RSP quietly contradict that commitment? My current read is that the answer to both questions is yes. If this is the case, I think that Anthropic has been pretty misleading about a crucial part of their safety plan, and this seems quite bad to me.
I think that you’re correct that Anthropic at least heavily implied that they weren’t going to “meaningfully advance” the frontier (even if they have not made any explicit commitments about this). I’d be interested in hearing when Dustin had this conversation w/ Dario—was it pre or post RSP release?
And as far as I know, the only commitments they’ve made explicitly are in their RSP, which commits to limiting their ability to scale to the rate at which they can advance and deploy safety measures. It’s unclear if the “sufficient safety measures” limitation is the only restriction on scaling, but I would be surprised if anyone senior at Anthropic was willing to make a concrete unilateral commitment to stay behind the curve.
My current story based on public info is that, up until mid-2022, there was indeed an intention to stay at the frontier but not push it forward significantly. This changed sometime in late 2022 to early 2023, maybe after ChatGPT was released and the AGI race became somewhat “hot”.
I feel some kinda missing mood in these comments. It seems like you’re saying “Anthropic didn’t make explicit commitments here”, and that you’re not weighting as particularly important whether they gave people different impressions, or benefited from that.
(AFAICT you haven’t explicitly stated “that’s not a big deal”, but, it’s the vibe I get from your comments. Is that something you’re intentionally implying, or do you think of yourself as mostly just trying to be clear on the factual claims, or something like that?)
The first Dario quote sounds squarely in line with releasing a Claude 3 on par with GPT-4 but well afterwards. The second Dario quote has a more ambiguous connotation, but if read explicitly it strikes me as compatible with the Claude 3 release.
If you spent a while looking for the most damning quotes, then these quotes strike me as evidence the community was just wishfully thinking while in reality Anthropic comms were fairly clear throughout. Privately pitching aggressive things to divert money from more dangerous orgs while minimizing head-on competition with OpenAI seems best to me (though obviously it’s also evidence that they’ll actually do the aggressive scaling things, so hard to know).
To make concrete the disagreement, I’d be interested in people predicting on “If Anthropic releases a GPT-5 equivalent X months behind, then their dollars/compute raised will be Y times lower than OpenAI” for various values of X.
“Diverting money” strikes me as the wrong frame here. Partly because I doubt this actually was the consequence—i.e., I doubt OpenAI etc. had a meaningfully harder time raising capital because of Anthropic’s raise—but also because it leaves out the part where this purported desirable consequence was achieved via (what seems to me like) straightforward deception!
If indeed Dario told investors he hoped to obtain an insurmountable lead soon, while telling Dustin and others that he was committed to avoid gaining any meaningful lead, then it sure seems like one of those claims was a lie. And by my ethical lights, this seems like a horribly unethical thing to lie about, regardless of whether it somehow caused OpenAI to have less money.
I don’t see any direct contradiction/lie there, at least between my version and the investor paraphrase. You don’t necessarily have to release to public general access the best model, in order to be so far ahead that competitors can’t catch up.
For example, LLMs at the research frontier could be a natural (Bertrand?) oligopoly where there’s a stable two-player oligopoly for the best models (#1 by XYZ, and #2 by Anthropic), and everyone else gives up: there is no point in spending $10b to stand up your own model to try to become #1 when XYZ/Anthropic will just cut prices or release the next iteration that they’d been holding back and you get relegated to #3 and there’s no reason for anyone to buy yours instead, and you go bankrupt. (This would be similar to other historical examples like Intel/AMD or Illumina: they enjoyed large margins and competing with them was possible, but was very dangerous because they had a lot of pent-up improvements they could unleash if you spent enough to become a threat. Or in the case of the highly stable iOS/Android mobile duopoly, just being too incredibly capital-intensive to replicate and already low-margin because the creators make their money elsewhere like devices/ads, having commoditized their complement.)
And then presumably at some point you either solve safety or the models are so capable that further improvement is unnecessary or you can’t increase capability; then the need for the AI-arms-race policy is over, and you just do whatever makes pragmatic sense in that brave new world.
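To make the entry-deterrence logic above concrete, here is a minimal sketch with entirely made-up numbers: the entry cost, market size, shares, and margins below are illustrative assumptions, not estimates of the actual LLM market. It just shows why a credible threat of retaliation can make a $10b entry unattractive even when the incumbents’ margins look large.

```python
# Toy entry-deterrence sketch of the "stable two-player oligopoly" argument above.
# Every number here is a made-up assumption, purely for illustration.

def challenger_payoff(entry_cost: float,
                      annual_market: float,
                      share_if_accommodated: float,
                      share_if_fought: float,
                      margin_if_accommodated: float,
                      margin_if_fought: float,
                      years: int,
                      p_incumbents_fight: float) -> float:
    """Expected profit for a third player deciding whether to train a frontier model.

    'Fight' means the incumbents respond by cutting prices and shipping the
    next iteration they had been holding back.
    """
    fought = share_if_fought * margin_if_fought * annual_market * years
    accommodated = share_if_accommodated * margin_if_accommodated * annual_market * years
    expected_revenue = p_incumbents_fight * fought + (1 - p_incumbents_fight) * accommodated
    return expected_revenue - entry_cost

# Hypothetical numbers: $10B to stand up a competitive model, a $20B/year market,
# and incumbents who almost certainly retaliate.
payoff = challenger_payoff(
    entry_cost=10e9,
    annual_market=20e9,
    share_if_accommodated=0.25,   # third place in a calm market
    share_if_fought=0.05,         # relegated to #3 after a price war
    margin_if_accommodated=0.30,
    margin_if_fought=0.05,
    years=5,
    p_incumbents_fight=0.9,
)
print(f"Expected challenger payoff: ${payoff / 1e9:,.1f}B")  # negative -> entry is unattractive
```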
It seems plausible that this scenario could happen, i.e., that Anthropic and OpenAI end up in a stable two-player oligopoly. But I would still be pretty surprised if Anthropic’s pitch to investors, when asking for billions of dollars in funding, is that they pre-commit to never release a substantially better product than their main competitor.
How surprising would you say you find the idea of a startup trying to, and successfully raising, not billions but tens of billions of dollars by pitching investors that their investment could be canceled at any time at the wave of a hand, that the startup pre-commits the investments will be canceled in the best-case scenario of the product succeeding, & that the investors ought to consider their investment “in the spirit of a donation”?
LLMs at the research frontier could be a natural oligopoly where there’s a stable two-player oligopoly for the best models (#1 by XYZ, and #2 by Anthropic), and everyone else gives up: there is no point in spending $10b to stand up your own model to try to become #1 when XYZ/Anthropic
Absolutely. The increasing cost of training compute and architecture searches, and the relatively low cost of inference compute, guarantee this. A model that has had more training compute and a better architecture will perform better on more affordable levels of compute across the board. This is also why an Intel or AMD CPU, or an Nvidia GPU, is worth more than an inferior product made from the same amount of silicon.
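A quick toy calculation of that fixed-cost versus marginal-cost point; every figure below is an illustrative assumption, not an actual cost for any real model:

```python
# Toy illustration of the fixed-cost-vs-marginal-cost structure behind the
# "natural oligopoly" claim above. Every number is a made-up assumption.

def avg_cost_per_query(training_cost: float, queries: float, inference_cost: float) -> float:
    """Average cost per query once the one-off training bill is spread over usage."""
    return training_cost / queries + inference_cost

TRAIN = 3e9    # assumed one-off frontier training + architecture-search spend ($)
INFER = 0.002  # assumed marginal inference cost per query ($)

incumbent = avg_cost_per_query(TRAIN, queries=1e12, inference_cost=INFER)  # huge user base
entrant = avg_cost_per_query(TRAIN, queries=1e10, inference_cost=INFER)    # small user base

print(f"incumbent avg cost/query: ${incumbent:.3f}")  # ~$0.005
print(f"entrant avg cost/query:   ${entrant:.3f}")    # ~$0.302
# The incumbent can price far below the entrant's break-even point while still
# covering its training spend, which is the classic condition for a natural
# monopoly or tight oligopoly.
```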
Wonder why it’s a stable two-player oligopoly and not a straight monopoly? From large corporate buyers preventing a monopoly by buying enough from the 2nd place player to keep them afloat?

Note that this situation is not ideal for Nvidia. This only sells 2 sets of training compute clusters sufficient to move the SOTA forward. Why sell 2 when you can sell at least 66? https://blogs.nvidia.com/blog/world-governments-summit/

The reasoning driving it being that a government cannot really trust someone else’s model; everyone needs their own.
I agree that most investment wouldn’t have otherwise gone to OAI. I’d speculate that investments from VCs would likely have gone to some other AI startup which doesn’t care about safety; investments from Google (and other big tech) would otherwise have gone into their internal efforts. I agree that my framing was reductive/over-confident and that plausibly the modal ‘other’ AI startup accelerates capabilities less than Anthropic even if they don’t care about safety. On the other hand, I expect diverting some of Google and Meta’s funds and compute to Anthropic is net good, but I’m very open to updating here given further info on how Google allocates resources.
I don’t agree with your ‘horribly unethical’ take. I’m not particularly informed here, but my impression was that it’s par-for-the-course to advertise and oversell when pitching to VCs as a startup? Such an industry-wide norm could be seen as entirely unethical, but I don’t personally have such a strong reaction.
I agree it’s common for startups to somewhat oversell their products to investors, but I think it goes far beyond “somewhat”—maybe even beyond the bar for criminal fraud, though I’m not sure—to tell investors you’re aiming to soon get “too far ahead for anyone to catch up in subsequent cycles,” if your actual plan is to avoid getting meaningfully ahead at all.
Not making any claims about actual Anthropic strategy here, but as gwern notes, I don’t think that these are necessarily contradictory. For example, you could have a strategy of getting far enough ahead that new entrants like e.g. Mistral would have a hard time keeping up, but staying on pace with or behind current competitors like e.g. OpenAI.
I assumed “anyone” was meant to include OpenAI—do you interpret it as just describing novel entrants? If so I agree that wouldn’t be contradictory, but it seems like a strange interpretation to me in the context of a pitch deck asking investors for a billion dollars.
I agree that this is a plausible read of their pitch to investors, but I do think it’s a bit of a stretch to consider it the most likely explanation. It’s hard for me to believe that Anthropic would receive billions of dollars in funding if they’re explicitly telling investors that they’re committing to only release equivalent or inferior products relative to their main competitor.