As usual, the part that seems bonkers crazy is where they claim the best thing they can do is keep making every scrap of capabilities progress they can. Keep making AI as smart as possible, as fast as possible.
“This margin is too small to contain our elegant but unintuitive reasoning for why”. Grump. Let’s please have a real discussion about this some time.
Okay, I’ll try to steelman the argument. Some of this comes from OpenAI and Altman’s posts; some of it is my addition.
Allowing additional compute overhang increases the likely speed of takeoff. If AGI through LLMs is possible, and that isn’t discovered for another 5 years, it might be achieved in the first go, with no public discussion and little alignment effort.
LLMs might be the most-alignable form of AGI. They are inherently oracles, and cognitive architectures made from them have the huge advantage of natural language alignment and vastly better interpretability than other deep network approaches. I’ve written about this in Capabilities and alignment of LLM cognitive architectures. I’m eager to learn I’m wrong, but in the meantime I actually think (for reasons spelled out there and to be elaborated in future posts) that pushing capabilities of LLMs and cognitive architectures is our best hope for achieving alignment, even if that speeds up timelines. Under this logic, slowing down LLM progress would be dangerous, as other approaches like RL agents would pass them by before appearing dangerous.
Edit: so in sum, I think their logic is obviously self-serving, but actually pretty solid when it’s steelmanned. I intend to keep pushing this discussion in future posts.
I really should have something short to say that turns the whole argument on its head, given how clear-cut it seems to me. I don’t have that yet, but I do have some rambly things to say.
I basically don’t think overhangs are a good way to think about things, because the bridge that connects an “overhang” to an outcome like “bad AI” seems flimsy to me. I would like to see a fuller explication some time from OpenAI (or a suitable steelman!) that can be critiqued. But here are some of my thoughts.
The usual argument that leads from “overhang” to “we all die” has some imaginary other actor who is scaling up their methods with abandon at the end, killing us all because it’s not hard to scale and they aren’t cautious. This is then used to justify scaling up your own method with abandon, hoping that we’re not about to collectively fall off a cliff.
For one thing, the hype and work being done now is making this problem a lot worse at all future timesteps. There was (and still is) a lot that people need to figure out regarding effectively using lots of compute. (For instance, architectures that can be scaled up, training methods and hyperparameters, efficient compute kernels, putting together datacenters and interconnect, data, etc etc.) Every chipmaker these days has started working on things with a lot of memory right next to a lot of compute with a tonne of bandwidth, tailored to these large models. These are barriers-to-entry that it would have been better to leave in place, if one was concerned with rapid capability gains. And just publishing fewer things and giving out fewer hints would have helped.
Another thing: I would take the whole argument as being made in better faith if I saw attempts being made to scale up anything other than capabilities at high speed, or signs that made it seem at all likely that “alignment” might be on track. Examples:
A single alignment result that was supported by a lot of OpenAI staff. (Compare and contrast the support that the alignment team’s projects get to what a main training run gets.)
Any focus on trying to claw cognition back out of the giant inscrutable floating-point numbers, into a domain easier to understand, rather than pouring more power into the systems that get much harder to inspect as you scale them. (Failure to do this suggests OpenAI and others are mostly just doing what they know how to do, rather than grappling with navigating us toward better AI foundations.)
Any success in understanding how shallow vs deep the thinking of the LLMs is, in the sense of “how long a chain of thoughts/inferences it can make as it composes dialogue”, and how this changes with scale. (Since the whole “LLMs are safer” thing relies on their thinking being coupled to the text they output; otherwise you’re back in giant inscrutable RL agent territory.)
The delta between “intelligence embedded somewhere in the system” and “intelligence we can make use of” looking smaller than it does. (Since if our AI gets to use more of its intelligence than we can, and this gets worse as we scale, this looks pretty bad for the “use our AI to tame the AI before it’s too late” plan.)
Also I can’t make this point precisely, but I think there’s something like capabilities progress just leaves more digital fissile material lying around the place, especially when published and hyped. And if you don’t want “fast takeoff”, you want less fissile material lying around, lest it get assembled into something dangerous.
Finally, to more directly talk about LLMs, my crux for whether they’re “safer” than some hypothetical alternative is about how much of the LLM “thinking” is closely bound to the text being read/written. My current read is that they’re doing something more like free-form thinking inside, which tries to concentrate probability mass on the right prediction. As we scale that up, I worry that any “strange competence” we see emerging is more due to the LLM having something like a mind inside, and less due to it having accrued more patterns.
That is indeed a lot of points. Let me try to parse them and respond, because I think this discussion is critically important.
Point 1: overhang.
Your first two paragraphs seem to be pointing to downsides of progress, and saying that it would be better if nobody made that progress. I agree. We don’t have guaranteed methods of alignment, and I think our odds of survival would be much better if everyone went way slower on developing AGI.
The standard thinking, which could use more inspection, but which I agree with, is that this is simply not going to happen. Individuals that decide to step aside are slowing progress only slightly. This leaves compute overhang that someone else is going to take advantage of, with nearly the same competence, and only slightly slower. Those individuals who pick up the banner and create AGI will not be infinitely reckless, but the faster progress from that overhang will make whatever level of caution they have less effective.
This is a separate argument from regulation. Adequate regulation will slow progress universally, rather than leaving it up to the wisdom and conscience of every individual who might decide to develop AGI.
I don’t think it’s impossible to slow and meter progress so that overhang isn’t an issue. But I think it is effectively even harder than alignment. We have decent suggestions on the table for alignment now, and as far as I know, no equally promising suggestions for getting everyone (and it does take almost everyone coordinating) to pass up the immense opportunities offered by capabilities overhangs.
Point 2: Are LLMs safer than other approaches?
I agree that this is a questionable proposition. I think it’s worth questioning. Aiming progress at easier-to-align approaches seems highly worthwhile.
I agree that an LLM may have something like a mind inside. I think current versions are almost certainly too dumb to be existentially dangerous (at least directly—if a Facebook algorithm can nearly cause an insurrection, who knows what dangerous side effects any AI can have).
I’m less worried about GPT-10 playing a superintelligent, Waluigi-collapsed villain than I am about a GPT-6 that has been amplified to agency, situational awareness, and weak superintelligence by scaffolding it into something like a cognitive architecture. I think this type of advance is inevitable. ChatGPT extensions and Bing Chat both use internal prompting to boost intelligence, and approaches like SmartGPT and Tree of Thoughts massively improve benchmark results over the base LLM.
Fortunately, this direction also has huge advantages for alignment. It has a very low alignment tax, since you can give the system additional goals in natural language, like “support human empowerment” or whatever the SOTA alignment goal is. And such systems have vastly better interpretability, since they’re at least summarizing their thoughts in natural language.
Here’s where your skepticism that they’re being honest about summarizing those thoughts comes into full force. I agree that it’s not reliable; for instance, changing the intermediate answer in chain-of-thought prompting often doesn’t change the final output, indicating that the stated reasoning was partly for show.
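Roughly, the kind of check I have in mind looks like this (a minimal sketch only; `complete` stands in for whatever completion API you’re using, and the two-stage prompt format is just illustrative, not any particular lab’s setup):

```python
def complete(prompt: str) -> str:
    """Placeholder for a call to whatever LLM completion endpoint is in use."""
    raise NotImplementedError


def cot_faithfulness_probe(question: str, corrupted_step: str) -> bool:
    """Return True if corrupting the intermediate reasoning changes the answer.

    If the final answer is identical either way, the stated chain of thought
    was likely 'for show' rather than load-bearing.
    """
    # 1. Elicit the model's own chain of thought, then a final answer from it.
    reasoning = complete(f"{question}\nLet's think step by step.")
    original_answer = complete(
        f"{question}\nReasoning: {reasoning}\nFinal answer:"
    )

    # 2. Re-ask with a deliberately wrong intermediate step substituted in.
    corrupted_answer = complete(
        f"{question}\nReasoning: {corrupted_step}\nFinal answer:"
    )

    # 3. If the answer moved, the reasoning was at least partly load-bearing.
    return corrupted_answer.strip() != original_answer.strip()
```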
However, a safer setup is to never use the same model twice. When you use chain-of-thought reasoning, construct a new context with the relevant information from memory; don’t just let the context window accrue, since this allows fake chains-of-thought and the collapse of the simulator into a waluigi.
Scaffolded LLMs should not turn an LLM into an agent, but rather create a committee of LLMs that are called for individual questions needed to accomplish that committee’s goals.
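To make that concrete, here is a rough sketch of the shape I have in mind: every call gets a freshly built context from retrieved memory, a natural-language goal is prepended each time, and separate single-purpose calls play the committee roles. The `complete` and `retrieve` functions, the role names, and the goal string are placeholders for illustration, not a description of any existing system.

```python
GOAL = "Support human empowerment."  # example natural-language goal, per the comment above


def complete(prompt: str) -> str:
    """Placeholder for a call to whatever LLM completion endpoint is in use."""
    raise NotImplementedError


def retrieve(query: str, memory: list[str], k: int = 3) -> list[str]:
    """Toy stand-in for memory retrieval (keyword match instead of embeddings)."""
    words = query.lower().split()
    hits = [note for note in memory if any(w in note.lower() for w in words)]
    return hits[:k]


def committee_call(role: str, question: str, memory: list[str]) -> str:
    """One single-purpose call. The context is rebuilt from scratch every time,
    so no chain of prior outputs (faithful or not) accumulates in the window."""
    notes = "\n".join(retrieve(question, memory))
    prompt = (
        f"Overall goal: {GOAL}\n"
        f"Role: {role}\n"
        f"Relevant notes:\n{notes}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return complete(prompt)


def run_step(task: str, memory: list[str]) -> str:
    """The 'committee': separate calls plan, solve, and review, each from a clean slate."""
    plan = committee_call("planner", f"What sub-question matters most for: {task}?", memory)
    answer = committee_call("solver", plan, memory)
    review = committee_call("reviewer", f"Does this answer serve the overall goal? {answer}", memory)
    memory.append(f"task: {task} | answer: {answer} | review: {review}")
    return answer
```

The design choice being illustrated is the trade: rebuilding the context each call costs some capability relative to letting a long window accrue, but it keeps every step inspectable and makes it harder for a persistent persona to form across calls.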
This isn’t remotely a solution to the alignment problem, but it really seems to have massive upsides, and only the same downsides as other practically viable approaches to AGI.
To be clear, I only see some form of RL agents as the other practical possibility, and I like our odds much less with those.
I think there are other, even more readily alignable approaches to AGI. But they all seem wildly impractical. I think we need to get ready to align the AGI we get, rather than just preparing to say I-told-you-so after the world refuses to forego massive incentives to take a much slower but safer route to AGI.
To paraphrase, we need to go to the alignment war with the AGI we get, not the AGI we want.
slowing down LLM progress would be dangerous, as other approaches like RL agents would pass them by before appearing dangerous.

This seems misleading to me & might be a false dichotomy. It’s not LLMs or RL agents. I think we’ll (unfortunately) build agents on the basis of LLMs & the capabilities they have. Every additional bit of progress on LLMs gives these agents more capabilities faster, with less time for alignment. They will be (and are!) built based on the mere (perceived) incentives of everybody involved & the unilateralist curse. (See esp. Gwern’s Tool AIs want to be Agent AIs.) I can see that such agents have interpretability advantages over RL agents, but since RL agents seem far off, with less work going into them, I don’t get why we should race regarding LLMs & LLM-based agents.
I’m personally not sure if “inherently oracles” is accurate for current LLMs (both before & after RLHF), but it seems simply false when considering plugins & AutoGPT (besides other recent stuff).
I was unclear. I meant that basic LLMs are oracles. The rest of what I said was about the agents made from LLMs you refer to. They are most certainly agents and not oracles. But they’re way better for alignment than RL agents. See my linked post for more on that.
I don’t think this is a fair consideration of the article’s entire message. This line from the article specifically calls out slowing down AI progress:

we could collectively agree (with the backing power of a new organization like the one suggested below) that the rate of growth in AI capability at the frontier is limited to a certain rate per year.
Having spent a long time reading through OpenAI’s statements, I suspect that they are trying to strike a difficult balance between:
A) Doing the right thing by way of AGI safety (including considering options like slowing down or not releasing certain information and technology).
B) Staying at or close to the lead of the race to AGI, given they believe that is the position from which they can have the most positive impact in terms of changing the development path and broader conversation around AGI.
Instrumental goal (B) is in tension (but not necessarily stark conflict, depending on how things play out) with ultimate goal (A).
What they’re presenting here in this article are ways to potentially create a situation where they could slow down and be confident that doing so wouldn’t actually lead to worse eventual outcomes for AGI safety. They are also trying to promote and escalate the societal conversation around AGI x-risk.
While I think it’s totally valid to criticise OAI on aspects of their approach to AGI safety, I think it’s also fair to say that they are genuinely trying to do the right thing and are simply struggling to chart what is ultimately a very difficult path.
Yeah I think my complaint is that OpenAI seems to be asserting almost a “boundary” re goal (B), like there’s nothing that trades off against staying at the front of the race, and they’re willing to pay large costs rather than risk being the second-most-impressive AI lab. Why? Things don’t add up.
(Example large cost: they’re not putting large organizational attention to the alignment problem. The alignment team projects don’t have many people working on them, they’re not doing things like inviting careful thinkers to evaluate their plans under secrecy, or taking any other bunch of obvious actions that come from putting serious resources into not blowing everyone up.)
I don’t buy that (B) is that important. It seems more driven by some strange status / narrative-power thing? And I haven’t ever seen them make explicit their case for why they’re sacrificing so much for (B). Especially when a lot of their original safety people fucking left due to some conflict around this?
Broadly many things about their behaviour strike me as deceptive / making it hard to form a counternarrative / trying to conceal something odd about their plans.
One final question: why do they say “we think it would be good if an international agency limited compute growth” but not also “and we will obviously be trying to partner with other labs to do this ourselves in the meantime, although not if another lab is already training something more powerful than GPT-4”?
Well, to be fair to Microsoft/OpenAI, they are a for-profit corporation; they can’t exactly say “and we will limit the future prospects of our business beyond X threshold”.
And since there are many such organizations on Earth, and they’re not going away anytime soon, race dynamics would overtake them even if they did issue such a statement and commit to it.
The salient question is before all this, how can truly global, truly effective coordination be achieved? At what cost? And is this cost bearable to the decision makers and wider population?
My personal opinion is that given current geopolitical tensions, it’s exceedingly unlikely this will occur before a mega-disaster actually happens, thus there might be some merit in an alternate approach.
They actually explicitly set up their business like this, as a capped-profit company.

On paper, for what is now a subsidiary of a much larger company. In practice, Microsoft management can’t now say there is a cap on their most promising future business area because of their subsidiary. This is how the management-board-shareholder dynamics of a big company work. I didn’t spell it out, as this is a well-known aspect of OpenAI.
That cap is very high, something like 1000x investment. They’re not near it, so they could be sued by investors if they admitted to slowing down even a little.
The whole scheme for OpenAI is nuts, but I think they’re getting less nuts as they think more about the issue. Which is weak praise.
Out of interest—if you had total control over OpenAI—what would you want them to do?
I kinda reject the energy of the hypothetical? But I can speak to some things I wish I saw OpenAI doing:
Having some internal sense amongst employees about whether they’re doing something “good” given the stakes, like Google’s old “don’t be evil” thing. Having a culture of thinking carefully about things, and of managers taking considerations seriously, rather than something more like management trying to extract as much engineering as quickly as possible without “drama” getting in the way.
(Perhaps they already have a culture like this! I haven’t worked there. But my prediction is that they do not, and that the org has a more “extractive” relationship to its employees. I think that this is bad, causes working toward danger, and exacerbates bad outcomes.)
To the extent that they’re trying to have the best AGI tech in order to provide “leadership” of humanity and AI, I want to see them be less shady / marketing / spreading confusion about the stakes.
They worked to pervert the term “alignment” to be about whether you can extract more value from their LLMs, and distract from the idea that we might make digital minds that are copyable and improvable, while also large and hard to control. (While pushing directly on AGI designs that have the “large and hard to control” property, which I guess they’re denying is a mistake, but anyhow.)
I would like to see fewer things perverted/distracted/confused; like, it’s according-to-me entirely possible for them to state more clearly what the end of all this is, and be more explicit about how they’re trying to lead the effort.
Reconcile with Anthropic. There is no reason, speaking on humanity’s behalf, to risk two different trajectories of giant LLMs built with subtly different technology, while dividing up the safety know-how amidst both organizations.
Furthermore, I think OpenAI kind-of stole/appropriated the scaling idea from the Anthropic founders, who left when they lost a political battle about the direction of the org. I suspect it was a huge fuck-you when OpenAI tried to spread this secret to the world, and continued to grow their org around it, while ousting the originators. If my model is at-all-accurate, I don’t like it, and OpenAI should look to regain “good standing” by acknowledging this (perhaps just privately), and looking to cooperate.
Idk, maybe it’s now legally impossible/untenable for the orgs to work together, given the investors or something? Or given mutual assumption of bad-faith? But in any case this seems really shitty.
I also mentioned some other things in this comment.