I’ve seen/heard a bunch of people in the LW-o-sphere saying that the OpenAI corporate drama this past weekend was clearly bad. And I’m not really sure why people think that? To me, seems like a pretty clearly positive outcome overall.
I’m curious why in the world people are unhappy about it (people in the LW-sphere, that is; obviously I can see why e.g. AI accelerationists would be unhappy about it). And I also want to lay out my models.
johnswentworth
Here’s the high-gloss version of my take. The main outcomes are:
The leadership who were relatively most focused on racing to AGI and least focused on safety are moving from OpenAI to Microsoft. Lots of employees who are relatively more interested in racing to AGI than in safety will probably follow.
Microsoft is the sort of corporate bureaucracy where dynamic orgs/founders/researchers go to die. My median expectation is that whatever former OpenAI group ends up there will be far less productive than they were at OpenAI.
It’s an open question whether OpenAI will stick around at all.
Insofar as they do, they’re much less likely to push the state of the art in capabilities, and much more likely to focus on safety research.
Insofar as they shut down, the main net result will be a bunch of people who were relatively more interested in racing to AGI and less focused on safety moving to Microsoft, which is great.
johnswentworth
My current (probably wrong) best guesses at why other people in the LW-o-sphere are saying this is terrible:
There’s apparently been a lot of EA-hate on twitter as a result. I personally expect this to matter very little, if at all, in the long run, but I’d expect it to be extremely disproportionately salient to rationalists/EAs/alignment folk.
OpenAI was an organization with a lot of AGI-accelerationists, and maybe people thought OpenAI was steering those accelerationist impulses in more safety-friendly directions, whereas Microsoft won’t?
Obviously the board executed things relatively poorly. They should have shared their reasons/excuses for the firing. (For some reason, in politics/corporate politics, people try to be secretive all the time and this seems-to-me to be very stupid in like 80+% of cases, including this one.) I don’t think that mistake will actually matter that much in the long term, but I can see why people focused on it would end up with a sort of general negative valence around the board’s actions.
Ruby
(Quick caveat that I think this question will be easier to judge once more info comes out. That said, I think thinking about it even now is useful for surfacing and sharing relevant observations and considerations.)
Ruby
I think what happens to Sam and others who end up at Microsoft is a pretty big crux here. If I thought that indeed those going to Microsoft would get caught in bureaucracy and not accomplish as much, and also that those staying behind wouldn’t push capabilities as hard, that might make the whole thing good for x-risk.
I’m not overwhelmingly confident here, but my impression is Sama might be competent enough to cut through the bureaucracy and get a lot done notwithstanding, and, more than that, by being competent and getting AI, he might end up running much of Microsoft. And being there just gives him a lot more resources with less effort than the whole invest-in-OpenAI cycle, and with fewer restrictions than he had at OpenAI.
One question is how independently he could operate. Nadella mentioned LinkedIn and GitHub (?) operating quite independently within Microsoft. Also, I think Microsoft will feel they have to “be nice” to Sama, as he is likely their key to AI dominance. He clearly commands a following and could go elsewhere, and I doubt he’d put up with bureaucracy that slowed him down.
An unanswered question so far is whether the board has acted with integrity (/cooperativeness). If the board is both judged to represent the most AI-concerned cluster (of which we are part) and they acted in a pretty bad way, then that itself could be really terrible for our (the cluster’s) ability to cooperate or move things in the broader world in good directions. Like, possibly a lot worse than any association with SBF.
johnswentworth
An unanswered question so far is whether the board has acted with integrity (/cooperativeness). If the board is both judged to represent the most AI-concerned cluster (of which we are part) and they acted in a pretty bad way, then that itself could be really terrible for our (the cluster’s) ability to cooperate or move things in the broader world in good directions. Like, possibly a lot worse than any association with SBF.
I remember shortly after the Sam-Bankman-Fried-pocalypse lots of rats/EAs/etc were all very preoccupied with how this would tarnish the reputation of EA etc. And a year later, I think it… just wasn’t that bad? Like, we remain a relatively-high-integrity group of people overall, and in the long run PR equilibrated to reflect that underlying reality. And yes, there are some bad apples, and PR equilibrated to reflect that reality as well, and that’s good. Insofar as I worry about PR at all (which is not much), I mostly just want it to be accurate.
I think this is very much the sort of thing which will seem massively more salient to EAs/alignment researchers/etc than to other people, far out of proportion to how much it actually matters, and we need to adjust for that.
johnswentworth
(Side note: I also want to say here something like “integrity matters a lot more than PR”, and the events of the past weekend seem like a PR problem much more than an integrity problem.)
Ruby
And a year later, I think it… just wasn’t that bad?
Disagree, or at least, I don’t think we know yet.
I agree that we’re not continuing to hear hate online, and the group continues and gets new members and life seems to go on. But also, even in a world where this weekend’s events hadn’t happened (I think they might dwarf what happened with SBF, or more likely compound it), I think it’s likely the SBF association would still influence key events and our ability to operate in the world.
There are what get called “stand by the Levers of Power” strategies, and I don’t know if they’re good, but they involve things like getting into positions within companies and governments that let you push for better AI outcomes, and I do think SBF might have made that a lot harder.
If Jason Matheny had been seeking positions (White House, RAND, etc.) following SBF, I think being a known EA might have been a real liability. And that’s not blatantly visible a year later, but I would bet it’s an association we have not entirely lost. And I think the same could be said of this weekend’s events, to the extent that they came from EA-typical motivations. This makes it a lot harder for our cluster to be trusted to be cooperative/good-faith/competent partners in things.
johnswentworth
I’m not overwhelmingly confident here, but my impression is Sama might be competent enough to cut through the bureaucracy and get a lot done notwithstanding, and, more than that, by being competent and getting AI, he might end up running much of Microsoft. And being there just gives him a lot more resources with less effort than the whole invest-in-OpenAI cycle, and with fewer restrictions than he had at OpenAI.
One question is how independently he could operate. Nadella mentioned LinkedIn and GitHub (?) operating quite independently within Microsoft. Also, I think Microsoft will feel they have to “be nice” to Sama, as he is likely their key to AI dominance. He clearly commands a following and could go elsewhere, and I doubt he’d put up with bureaucracy that slowed him down.
To my knowledge, Sama never spent much time in a big bureaucratic company before? He was at a VC firm and startups.
And on top of that, my not-very-informed-impression-from-a-distance is that he’s more a smile-and-rub-elbows guy than an actual technical manager; I don’t think he was running much day-to-day at OpenAI? Low confidence on that part, though; I have not heard it from a source who would know well.
The other side of this equation has less uncertainty, though. I found it hilarious that Nadella mentioned LinkedIn as a success of Microsoft; that product peaked right before it was acquired. Microsoft pumped money out of it, but they sure didn’t improve it.
Ruby
I agree about LinkedIn going downhill, so that really does support your stance more than mine. Still, it seems there are precedents of acquisitions getting to operate independently. Instagram, I think. Pixar. Definitely others where a lot of freedom is maintained.
johnswentworth
And that’s not blatantly visible a year later, but I would bet it’s an association we have not entirely lost. And I think the same could be said of this weekend’s events, to the extent that they came from EA-typical motivations. This makes it a lot harder for our cluster to be trusted to be cooperative/good-faith/competent partners in things.
I basically agree with this as a mechanism, to be clear. I totally think the board made some unforced PR errors and burned some of the EA commons as a result. I just think it’s nowhere near as important as the other effects, and EAs are paying disproportionate attention to it.
Ruby
This seems to be a crux: did the board merely make PR errors, or other errors as well?
I can dive in with my take, but also happy to first get your accounting of the errors.
johnswentworth
Still, it seems there are precedents of acquisitions getting to operate independently. Instagram, I think. Pixar. Definitely others where a lot of freedom is maintained.
But not so much Microsoft acquisitions. (Also, Instagram is pretty debatable, but I’ll definitely give you Pixar.) Like, the people who want to accelerate AGI are going to end up somewhere, and if I had to pick a place Microsoft would definitely make the short list. The only potentially-better option which springs to mind right now would be a defense contractor: same-or-worse corporate bloat, but all their work would be Officially Secret.
Ruby
I think for most people I’d agree with that. Perhaps I’m buying too much into charisma, but Sam does give me the vibe of Actual Competence, which makes it more scary.
johnswentworth
This seems to be a crux: did the board merely make PR errors, or other errors as well?
I’ll note that this is not a crux for me, and I don’t really know why it’s a crux for you? Like, is this really the dominant effect, such that it would change the sign of the net effect on X-risk from AGI? What’s the mechanism by which it ends up mattering that much?
Ruby
I was going to write stuff about integrity, and there’s stuff to that, but the thing that is striking me most right now is that the whole effort seemed very incompetent and naive. And that’s upsetting. I have the fear that my cluster has been revealed (I mean, if it’s true, good for it to come out) to not be very astute, and that you shouldn’t deal with us because we clearly don’t know what we’re doing.
johnswentworth
I have the fear that my cluster has been revealed (I mean, if it’s true, good for it to come out) to not be very astute, and that you shouldn’t deal with us because we clearly don’t know what we’re doing.
First, that sure does sound like the sort of thing which the human brain presents to us as a far larger, more important fact than it actually is. Ingroup losing status? Few things are more prone to distorted perception than that.
But anyway, I repeat my question: what’s the mechanism by which it ends up mattering that much? Like, tell me an overly-specific story with some made-up details in it. (My expectation is that the process of actually telling the story will make it clear that this probably doesn’t matter as much as it feels like it matters.)
johnswentworth
This seems to be a crux: did the board merely make PR errors, or other errors as well?
I can dive in with my take, but also happy to first get your accounting of the errors.
(Now realizing I didn’t actually answer this yet.) I think the board should have publicly shared their reasons for sacking Sama. That’s basically it; that’s the only part which looks-to-me like an obvious unforced mistake.
Ruby
Suppose that johnswentworth and a related cluster of researchers figure out some actually key useful stuff that makes the difference for Alignment. It has large implications for how to go about training your models and evaluating your models (in the evals sense), and relatedly for what kinds of policy are useful. Something something in natural abstractions that isn’t immediately useful for building more profitable AI, but does shape what your agenda and approaches should look like.
Through the social graph, people reach out to former-OpenAI employees to convince them of the importance of the results. Jan Leike is convinced but annoyed because he feels burned; Ilya is also sympathetic, but feels constrained in pushing LW/EA-cluster research ideas among a large population of people who have antagonistic, suspicious feelings toward the doomers who tried to fire Sam Altman and didn’t explain why.
People go to Washington and/or the UK taskforce (I can pick names if it matters); some people are sympathetic, others fear the association. It doesn’t help that Helen Toner of CSET seemed to be involved in this fiasco. It’s not that no one will talk to us or that no one will entertain the merits of the research, but that most people don’t judge the research on its merits, and there isn’t the credibility to adopt it or its implications. A lot of friction arises from interacting with the “doomer cluster”. There’s bleed-over from the judgment that our political judgment is poor to the judgment that our general judgment (including about the risks of AI at all) is poor.
If the things you want people to do differently are costly, e.g. your safer AI is more expensive, but you are seen as untrustworthy, low-integrity, low-transparency, and politically incompetent, then I think you’ll have a hard time getting buy-in for it.
I can get more specific if that would help.
johnswentworth
Ok, and having written all that out… quantitatively, how much do you think the events of this past weekend increased the chance of something-vaguely-like-that-but-more-general happening, and that making the difference between doom and not-doom?
Ruby
That’s a good question. I don’t think I’m maintaining enough calibration/fine-grained estimates to answer in a non-bullshit way, so I want to decline answering that for now.
My P(Doom) feels like it’s up somewhere above 1x but less than 2x, but it would take some work to separate out “I learned info about how the world is” from “actions by the board made things worse”.
johnswentworth
I don’t think I’m maintaining enough calibration/fine-grained estimates to answer in a non-bullshit way, so I want to decline answering that for now.
Yeah, totally fair, that was a pretty large ask on my part.
johnswentworth
Getting back to my own models here: as I see it, (in my median world) the AI lab which was previously arguably-leading the capabilities race is now de-facto out of the capabilities race. Timelines just got that much longer, race dynamics just got much weaker. And I have trouble seeing how the PR blowback from the board’s unforced errors could plausibly be bad enough to outweigh a benefit that huge, in terms of the effect on AI X-risk.
Like, that is one of the best possible things which could realistically happen from corporate governance, in terms of X-risk.
Ruby
I’m not ruling the above out, but it seems very non-obvious to me, and I’d at least weakly bet against it.
As above, the crux is how much less/more AGI progress happens at Microsoft. I viewed OpenAI as racing pretty hard but still containing voices of restraint and some earnest efforts on behalf of at least some (Jan, Ilya?) to make AGI go well. I imagine Microsoft having even less caution than OpenAI, more profit focus, and less ability for those with more concern about / better models of AI risk to influence things for the better.
There’s the model of Microsoft as slow bureaucracy, but it’s also a 3-trillion-dollar company with a lot of cash, correctly judging that AI is the most important thing for dominance in the future. If they decide they want to start manufacturing their own chips or whatever, it’s easy for them to do so. It’s also natural for Google to feel more threatened if Microsoft contains its own AI division. We end up with both Microsoft and Google containing large, skilled AI research teams, each with billions of capital to put behind a race.
Really, to think this reduces race dynamics seems crazy to me, actually.
johnswentworth
I mean, I don’t really care how much e.g. Facebook AI thinks they’re racing right now. They’re not in the game at this point. That’s where I expect Microsoft to be a year from now (median world). Sure, the snails will think they’re racing, but what does it matter if they’re not going to be the ones in the lead?
(If the other two currently-leading labs slow down enough that Facebook/Microsoft are in the running, then obviously Facebook/Microsoft would become relevant, but that would be a GREAT problem to have relative to current position.)
Google feeling threatened… maaayyyybe, but that feels like it would require a pretty conjunctive path of events for it to actually accelerate things.
johnswentworth
… anyway, we’ve been going a couple hours now, and I think we’ve identified the main cruxes and are hitting diminishing returns. Wanna do wrap-up thoughts? (Or not, we can keep pushing a thread if you want to.)
Ruby
Sounds good, yeah, I think we’ve ferreted out some cruxes. Seems Microsoft productivity is a big one, and how much they remain a leading player. I think they do. I think they keep most of the talent, and effectively OpenAI gets to carry on with more resources and less restraint.
johnswentworth
I think this part summarized my main take well:
(in my median world) the AI lab which was previously arguably-leading the capabilities race is now de-facto out of the capabilities race. Timelines just got that much longer, race dynamics just got much weaker. And I have trouble seeing how the PR blowback from the board’s unforced errors could plausibly be bad enough to outweigh a benefit that huge, in terms of the effect on AI X-risk.
johnswentworth
I think we both agree that “how productive are the former OpenAI folks at Microsoft?” is a major crux?
Ruby
The other piece is I think it’s likely a lot of credibility and trust with those expressing AI-doom/risk got burned and will be very hard to replace, in ways that have reduced our ability to nudge things for the better (whether we had the judgment to, or were going to do so successfully otherwise, I’m actually really not sure).
Ruby
I think we both agree that “how productive are the former OpenAI folks at Microsoft?” is a major crux?
Yup, agree on that one. I’d predict something like: we continue to see the same or greater rate of GPT editions, capabilities, etc., out of Microsoft as OpenAI was producing until now.
Ruby
Cheers, I enjoyed this
johnswentworth
[Added 36 hours later:] Well that model sure got downweighted quickly.
Dialogue on the Claim: “OpenAI’s Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI”