The major shift in the next three years will be that, as a rule, top-level AI labs will not release their best models. I'm fairly certain this has already been somewhat the case for OpenAI, Anthropic, and Google over the past year. At some point, full utilization of a SOTA model will be a strategic advantage that companies reserve for their own tactical purposes. The moment any $X of value can be netted from an inference run of a model for less than $(X−Y) in costs, where Y represents the marginal labor, maintenance, and amortized risk cost of each run's output, no company would be advantaged by releasing the model for anyone other than themselves to use. I imagine this closed-source event horizon will occur sometime in late 2024.
Related previous discussion:
Soft takeoff can still lead to decisive strategic advantage — AI Alignment Forum
Review of Soft Takeoff Can Still Lead to DSA — AI Alignment Forum
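The $X / $(X−Y) condition above can be sketched as a small decision rule. This is only an illustrative toy, and all the numbers below are made-up assumptions, not real figures:

```python
# Toy sketch of the "closed-source event horizon" condition:
# a lab prefers internal use once value - compute_cost > overhead,
# i.e. the run nets $X of value at less than $(X - Y) in costs,
# with Y the marginal labor/maintenance/amortized-risk overhead per run.

def prefers_internal_use(value_per_run: float,
                         compute_cost_per_run: float,
                         overhead_per_run: float) -> bool:
    """True when a single inference run nets more than its marginal
    overhead, so the lab captures more by using the model itself than
    by selling access to it."""
    return value_per_run - compute_cost_per_run > overhead_per_run

# Illustrative numbers only: each run yields $10 of value, costs $2
# in compute, and carries $3 of labor/maintenance/risk overhead.
print(prefers_internal_use(10.0, 2.0, 3.0))   # nets $8 > $3 overhead
print(prefers_internal_use(10.0, 8.0, 3.0))   # nets $2 < $3 overhead
```

On this framing, the argument is that once the first condition holds for a wide range of tasks, every run sold to an outsider is a run the lab could have profited from directly.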
This is a very good, and very scary, point: another thing that could produce at least the appearance of a discontinuity. One symptom of this scenario would be a widespread, false belief that "open source" models are SOTA.
It might be good to brainstorm other symptoms to prime ourselves to recognize when we are in this scenario: complete hiring freezes or massive layoffs at the firms in question, aggressive expansion into previously unrelated markets, etc.
Not sure I understand; if model runs generate value for the creator company, surely they'd also create value that lots of customers would be willing to pay for. If every model run generates value, and there's the ability to scale, then why not maximize revenue by maximizing the number of people using the model? The creator company can just charge the customers, no? Sure, competitors can use it too, but does that really outweigh losing an enormous market of customers?
That’s very true, but there are two reasons why a company may not be inclined to release an extremely capable model:
1. Safety risk: if someone jailbreaks a model in some unexpected way, the risk of misuse is much higher with a more capable model. OpenAI had GPT-4 for 9–10 months before releasing it, spending that time on RLHF and on safety tuning that arguably lobotomized it. The Summer 2022 internal version of GPT-4 was, according to Microsoft researchers, more generally capable than the released version (as evidenced by the draw-a-unicorn test). This necessary delay, and the assumed risk, will naturally be greater with a larger model: larger models, so far, seem harder to simply RLHF into unjailbreakability, and because they are more capable, any jailbreak carries more risk, so the general business-level margin of safety will be higher.
2. Sharing/exposing capabilities: any business wants to maintain a strategic advantage. Releasing a SOTA model allows a company's competitors to use it, probe its capabilities, and train models on its outputs. This reality has become more apparent in the past 12 months.
It does seem a little silly to give competitors API access to your brain. If one has a big enough lead, one can simply capture one's competitors' markets.