In a private Slack, someone gave Sam Altman credit for putting EAs on the OpenAI board originally, especially since this turned out to be pretty risky / costly for him.
I responded:
It seems to me that the fact that there were AI safety people on the board at all is fully explainable by strategic moves from an earlier phase of the game.
Namely, OpenAI traded a board seat for OpenPhil grant money and, more importantly, OpenPhil's endorsement, which translated into talent sourcing and effectively defused what might otherwise have been vocal denouncement from one of the major intellectually influential hubs of the world.
No one knows how counterfactual history might have developed, but it doesn't seem unreasonable to think that there is a nearby counterfactual world in which EA culture successfully created a narrative that groups trying to build AGI were bad and defecting.
He's the master at this game, not me, but I would bet at even odds that Sam was actively tracking EA as a potential social threat that could dampen OpenAI's narrative flywheel.
I don't know that OpenPhil's grant alone was sufficient to switch from the "EAs vocally decry OpenAI as making the world worse" equilibrium to the "EAs largely (but not universally) think OpenAI is bad in private, but mostly stay silent in public and go to work at OpenAI" equilibrium. But I think it was a major component. OpenPhil's cooperation bought moral legitimacy for OpenAI amongst EAs.
In retrospect, it looks like OpenAI successfully bought out the EAs through OpenPhil, and to a lesser extent through people like Paul.
And Ilya in particular was a founder and one of the core technical leads. It makes sense for him to be a board member, and my understanding (someone correct me) is that he grew to think that safety was more important over time, rather than starting out as an “AI safety person”.
And even so, the rumor is that the thing that triggered the coup was Sam maneuvering to get Helen removed. I highly doubt that Sam planned for a situation where he was removed as CEO and then pulled off some crazy jujitsu move with the whole company such that he actually ends up firing the board instead. But if you just zoom out and look at what actually played out, he clearly came out ahead, with control consolidated. Which is the outcome he was maybe steering towards all along?
So my first-pass summary of the situation: when OpenAI was small and had only medium fame and social power, Sam maneuvered to get the cooperation of EAs, because that defused a major narrative threat and bought the company moral legitimacy (when that legitimacy was more uncertain). Then, after ChatGPT and GPT-4, when OpenAI was rich and famous and had more narrative power than the EAs, Sam moved to remove the people he had made those prestige-trades with in the earlier phase, since he no longer needed their support and had no reason to let them keep power over the now-force-to-be-reckoned-with company.
Granted, I’m far from all of this and don’t have confidence about any of these political games. But it seems wrong to me to give Sam points for putting “AI safety people” on the board.
Note that at the time of the donation, Altman was co-chair of the board but still two years away from becoming CEO.
A more cynical take, based on the Musk/Altman emails: Altman was expecting Musk to be CEO. He set up a governance structure which would effectively be able to dethrone Musk, with himself as the obvious successor, and was happy to staff the board with ideological people who might well take issue with something Musk did down the line, giving Altman a shot at the throne.
Musk walked away, and it would've been too weird for Altman to then reverse course on the governance structure. Altman judged the trap unlikely enough to fire that it wasn't worth disarming at any point before it actually did.
I don't know whether the dates line up to disconfirm this, but I could see this kind of 5D-chess move happening. Though maybe ordinary power-and-incentive psychology is sufficient to explain it.
@Alexander Gietelink Oldenziel, you put a soldier-mindset react on this (and also on my earlier, similar comment this week).
What makes you think so?
Definitely this model posits adversariality, but I don't think I'm invested in "my side" of the argument winning here, FWIW. This currently seems like the most plausible high-level summary of the situation, given my level of context.
Is there a version of this comment that you would regard as better?
Yes, sorry Eli. I meant to write out a more fully fleshed-out response, but unfortunately it got stuck in drafts.
The tl;dr is that I feel this perspective singles out Sam Altman as some uniquely Machiavellian actor in a way I find naive / misleading and ultimately maybe unhelpful.
I think in general I'm skeptical of the intense focus on individuals & individual tech companies that LW/EA has developed recently. Frankly, it feels more rooted in savannah-brained tribalism & human interest than an even-keeled analysis of what factors are actually important, neglected, and tractable.
Um, I'm not attempting to do cause prioritization or action-planning in the above comment. It's more like sense-making. Before I move on to the question of what we should do, I want to have an accurate model of the social dynamics in the space.
(That said, it isn't a foregone conclusion that actionable takeaways will come out of this analysis. If the above story is true, I should make some kind of update about the strategies that EAs adopted with regard to OpenAI in the late 2010s. Insofar as those were mistakes, I don't want to repeat them.)
It might turn out to be right that the above story is "naive / misleading and ultimately maybe unhelpful". I'm certainly not an expert at understanding these dynamics. But just saying that it's naive, or that it seems rooted in tribalism, doesn't help me or others get a better model.
If it’s misleading, how is it misleading? (And is misleading different than “false”? Are you like “yeah this is technically correct, but it neglects key details”?)
Admittedly, you did label it as a tl;dr, and I did prompt you to elaborate on a react. So maybe it’s unfair of me to request even further elaboration.
Yeah, I'm afraid I have too many other obligations right now to give an elaboration that does it justice.
OTOH, I'm in the Bay and we should definitely catch up sometime!
Fair enough!
Sounds good.
I haven't perceived the degree of focus as intense, and if I had, I might be tempted to level similar criticism. But I think current people/companies do clearly matter some, so they warrant some focus. For example:
I think it’s plausible that governments will be inclined to regulate AI companies more like “tech startups” than “private citizens building WMDs,” the more those companies strike them as “responsible,” earnestly trying their best, etc. In which case, it seems plausibly helpful to propagate information about how hard they are in fact trying, and how good their best is.
So far, I think many researchers who care non-trivially about alignment—and who might have been capable of helping, in nearby worlds—have for similar reasons been persuaded to join whatever AI company currently has the most safetywashed brand instead. This used to be OpenAI, is now Anthropic, and may be some other company in the future, but it seems useful to me to discuss the details of current examples regardless, in the hope that e.g. alignment discourse becomes better calibrated about how much to expect such hopes will yield.
There may exist some worlds where it’s possible to get alignment right, yet also possible not to, depending on the choices of the people involved. For example, you might imagine that good enough solutions—with low enough alignment taxes—do eventually exist, but that not all AI companies would even take the time to implement those.
Alternatively, you might imagine that some people who come to control powerful AI truly don’t care whether humanity survives, or are even explicitly trying to destroy it. I think such people are fairly common—both in the general population (relevant if e.g. powerful AI is open sourced), and also among folks currently involved with AI (e.g. Sutton, Page, Schmidhuber). Which seems useful to discuss, since e.g. one constraint on our survival is that those who actively wish to kill everyone somehow remain unable to do so.
I definitely understand the skepticism of intense focus on individuals/individual tech companies, but also, these are the groups trying to build the most consequential technology in all of history, so it’s natural that tech companies get the focus here.
*got paid to remove them as a social threat