“We were especially alarmed to notice that the list contains at least 12 former employees currently working on AI policy, and 6 working on safety evaluations. This includes some in leadership positions, for example:”
I don’t really follow this reasoning. If anything, playing a leadership role in AI policy or safety evaluations will usually give you an additional reason not to publicly disparage AI companies (to avoid being seen as partisan), which makes being subject to such an agreement less of an issue. I would be pretty surprised if people subject to these agreements felt particularly constrained in what they could say as part of their official duties, although if I am wrong about this then it does seem like quite a concerning thing to have happened. The obvious exception to this is if the role involves unofficial public commentary about labs, but it’s not obvious to me that this has been a big part of the role of any of the people on your list, and even then, they may not have felt especially constrained, depending on the individual. It’s also worth noting that several of these roles require the holder to give up or donate lab equity to avoid any conflict of interest, regardless of any non-disparagement agreements.
I suspect the crux may be our differing interpretations of the agreement. I’m not sure where your interpretation that it prohibits “taking any actions which might make the company less valuable” comes from; could you highlight the part of the agreement you are basing that on?
When you have a role in policy or safety, it may usually be a good idea not to voice strong opinions on any given company. If you nevertheless feel compelled to do so by circumstances, it’s a big deal if you have personal incentives against that—especially if they’re not disclosed.
Yeah I agree with this, and my original comment comes across too strongly upon re-reading. I wanted to point out some counter-considerations, but the comment ended up unbalanced. My overall view is:
It was highly inappropriate for the company to have been issuing these agreements so widely, especially using such aggressive tactics and without allowing disclosure of the agreement, given the technology that it is developing.
The more high-profile and credible a person is, the more damaging it is for this person to have been subject to the agreement.
Nevertheless, it is a mistake to think of potential “disparagement” as part of the job duties of most of the people mentioned, and the post appears to wildly misinterpret the meaning of this term as “taking any actions which might make the company less valuable”.
Ultimately, it would have looked extremely bad for the company to enforce one of these agreements, so the primary effect of the contract comes down to how individuals felt that it constrained their behavior. We don’t have great visibility into this. It’s possible that some of these people felt quite constrained, and it’s also possible that some of these people weren’t even aware of the non-disparagement clause because they didn’t notice it when they signed.
Thankfully, most of this is now moot as the company has retracted the contract. I should emphasize that there may remain some legal ambiguity and additional avenues for retaliation, but I am optimistic that these will be cleaned up in the near future. There will still be non-disparagement agreements in place in cases where “the non-disparagement provision was mutual” (in the words of the company), but my strong guess is that this refers only to the original Anthropic departures and perhaps a handful of other individuals who were high up at the company.
It remains important for people to disclose their financial interest in the company when appropriate, or in some cases give up this interest to avoid a conflict of interest.
Note: I have a financial interest in the company and was subject to one of these agreements until recently.
“Thankfully, most of this is now moot as the company has retracted the contract.”
I don’t think any of this is moot, since the thing that is IMO most concerning is people signing these contracts, then going into policy or leadership positions and not disclosing that they signed those contracts. Those things happened in the past and are real breaches of trust.
I imagine many of the people going into leadership positions were prepared to ignore the contract, or maybe even forgot about the nondisparagement clause altogether. The clause is also open to more avenues of legal attack if it’s enforced against someone who takes another position which requires disparagement (e.g. if it’s argued to be a restriction on engaging in business). And if any individual involved divested themselves of equity before taking up another position, there would be fewer ways for the company to retaliate against them. I don’t think it’s fair to view this as a serious breach of trust on the part of any individual, without clear evidence that it impacted their decisions or communication.
But it is fair to view the overall situation as concerning, in that it could happen with nobody noticing, and to try to design defenses that prevent this or similar things from happening in the future, e.g. a clear statement that people going into independent leadership positions have no conflicts of interest, including legal obligations, as well as a consistent divestment policy (though that creates its own weird incentives).
“I imagine many of the people going into leadership positions were prepared to ignore the contract, or maybe even forgot about the nondisparagement clause”
I could imagine it being the case that people are prepared to ignore the contract. But unless they publicly state that, it wouldn’t ameliorate my concerns—otherwise how is anyone supposed to trust they will?
“The clause is also open to more avenues of legal attack if it’s enforced against someone who takes another position which requires disparagement (e.g. if it’s argued to be a restriction on engaging in business).”
That seems plausible, but even if this does increase the likelihood that they’d win a legal battle, legal battles still pose huge risk and cost. This still seems like a meaningful deterrent.
“I don’t think it’s fair to view this as a serious breach of trust on behalf of any individual, without clear evidence that it impacted their decisions or communication.”
But how could we even get this evidence? If they’re bound to the agreement, their actions just look like an absence of saying disparaging things about OpenAI, or of otherwise damaging its finances or reputation. And it’s hard to tell, from the outside, whether this is a reflection of an obligation or of a genuine stance. Positions of public responsibility require public trust, and the public doesn’t have access to the inner workings of these people’s minds. So I think it’s reasonable, upon finding out that someone has a huge and previously undisclosed conflict of interest, to assume that it might be influencing their behavior.
Evidence could look like:
1. Someone was in a position of trust where they had to make a judgement about OpenAI.
2. They said something bland and inoffensive about OpenAI.
3. Later, you independently find that they likely knew about something bad, and likely weren’t saying it because of the nondisparagement agreement (rather than ordinary confidentiality agreements).
This requires some model of “this specific statement was influenced by the agreement” instead of just “you never said anything bad about OpenAI because you never gave opinions on OpenAI”.
I think one should require this kind of positive evidence before calling it a “serious breach of trust”, but people can make their own judgement about where that bar should be.
I agree, but I also doubt the contract has even been widely retracted. Why do you think it has, Jacob? Very few people have reported being released so far.
(This is Kelsey Piper). I am quite confident the contract has been widely retracted. The overwhelming majority of people who received an email did not make an immediate public comment. I am unaware of any people who signed the agreement after 2019 and did not receive the email, outside cases where the nondisparagement agreement was mutual (which includes Sutskever and likely also Anthropic leadership). In every case I am aware of, people who signed before 2019 did not reliably receive an email but were reliably able to get released if they emailed OpenAI HR.
If you signed such an agreement and have not been released, you can of course contact me on Signal: 303 261 2769.
“I am quite confident the contract has been widely retracted.”
Can you share your reasons for thinking this? Given that people who remain bound can’t say so, I feel hesitant to conclude, without clear evidence, that people aren’t still bound.
“I am unaware of any people who signed the agreement after 2019 and did not receive the email, outside cases where the nondisparagement agreement was mutual (which includes Sutskever and likely also Anthropic leadership).”
Excepting Jack Clark (who works for Anthropic) and Remco Zwetsloot (who left in 2018), I would think all the policy leadership folks listed above meet these criteria, yet none have reported being released. Would you guess that they have been?
I have been in touch with around a half dozen former OpenAI employees who I spoke to before former employees were released, and all of them later informed me they were released. They were not in any identifiable reference class that would have let OpenAI selectively release them while not releasing most people. I have further been in touch with many other former employees since they were released who confirmed this. I have not heard from anyone who wasn’t released, and I think it is reasonably likely I would have heard from them anonymously on Signal. Also, not releasing a bunch of people after saying they would seems like an enormously unpopular, hard to keep secret, and not very advantageous move for OpenAI, which is already taking a lot of flak for this. I also have a model of how people choose whether or not to make public statements where it’s extremely unsurprising most people would not choose to do so.
I would indeed guess that all of the people you listed have been released, if they were even subject to such agreements in the first place, which I do not know (and the fact that Geoffrey Irving was not offered such an agreement is some basis to think they were not uniformly imposed during some of the relevant time periods, imo).
Thanks, that’s helpful context.
“I also have a model of how people choose whether or not to make public statements where it’s extremely unsurprising most people would not choose to do so.”
I agree it’s unsurprising that few rank-and-file employees would make statements, but I am surprised by the silence from those in policy/evals roles. From my perspective, active non-disparagement obligations seem clearly disqualifying for most such roles, so I’d think they’d want to clarify.
It sounds from this back and forth like we should assume that Anthropic leadership who left from OAI (so Dario and Daniela Amodei, Jack Clark, Sam McCandlish, others?) are still under NDA because it was probably mutual. Does that sound right to others?
“I have not heard from anyone who wasn’t released, and I think it is reasonably likely I would have heard from them anonymously on Signal. Also, not releasing a bunch of people after saying they would seems like an enormously unpopular, hard to keep secret, and not very advantageous move for OpenAI, which is already taking a lot of flak for this.”
I’m not necessarily imagining that OpenAI failed to release a bunch of people, although that still seems possible to me. I’m more concerned that they haven’t released many key people, and while I agree that you might have received an anonymous Signal message to that effect if it were true, I still feel alarmed that many of these people haven’t publicly stated otherwise.
“I also have a model of how people choose whether or not to make public statements where it’s extremely unsurprising most people would not choose to do so.”
I do find this surprising. Many people are aware of who former OpenAI employees are, and hence are aware of who was (or is) bound by this agreement. At the very least, if I were in this position, I would want people to know that I was no longer bound. And it does seem strange to me, if the contract has been widely retracted, that so few prominent people have confirmed being released.
It also seems pretty important to figure out who is under mutual non-disparagement agreements with OpenAI, which would still (imo) pose a problem if it applied to anyone in safety evaluations or policy positions.
See the statement from OpenAI in this article:
“We’re removing nondisparagement clauses from our standard departure paperwork, and we’re releasing former employees from existing nondisparagement obligations unless the nondisparagement provision was mutual. We’ll communicate this message to former employees.”
They have communicated this to me and I believe I was in the same category as most former employees.
I think the main reasons so few people have mentioned this are:
As I mentioned, there is still some legal ambiguity and additional avenues for retaliation
Some people are taking their time over what they want to say
Most people don’t want to publicly associate themselves with a controversial situation
Most people aren’t inclined to disparage their former employer anyway, and so they may not think of their own situation as that big of a deal
“the post appears to wildly misinterpret the meaning of this term as ‘taking any actions which might make the company less valuable’”
I’m not a lawyer, and I may be misinterpreting the non-interference provision—certainly I’m willing to update the post if so! But upon further googling, my current understanding is still that in contracts, “interference” typically means “anything that disrupts, damages or impairs business.”
And the provision in the OpenAI offboarding agreement is written so broadly (“Employee agrees not to interfere with OpenAI’s relationship with current or prospective employees, current or previous founders, portfolio companies, suppliers, vendors or investors”) that I assumed it was meant to encompass essentially all business impact, including e.g. the company’s valuation.
I see the concerns as these:
The four corners of the agreement seem to define ‘disparagement’ broadly, so one might reasonably fear (e.g.) “First author on an eval especially critical of OpenAI versus its competitors”, or “Policy document highly critical of OpenAI leadership decisions” might ‘count’.
Given Altman’s/OpenAI’s vindictiveness and duplicity, and the previous ‘safeguards’ (from their perspective) which give them all the cards in terms of folks being able to realise the value of their equity, “They will screw me out of a lot of money if I do something they really don’t like (regardless of whether it ‘counts’ per the non-disparagement agreement)” seems a credible fear.
It appears Altman tried to get Toner kicked off the board for being critical of OpenAI in a policy piece, after all.
This is indeed moot for roles which require equity to be surrendered anyway. I’d guess most roles outside government (and maybe some within it) do not have such requirements. A conflict of interest roughly along the lines of the first two points makes impartial performance difficult, and credible impartial performance impossible (i.e. even if indeed Alice can truthfully swear “My being subject to such an agreement has never influenced my work in AI policy”, reasonable third parties would be unwise to believe her).
The ‘non-disclosure of non-disparagement’ makes this worse, as it interferes with this conflict of interest being fully disclosed. “Alice has a bunch of OpenAI equity” is one thing, “Alice has a bunch of OpenAI equity, and has agreed to be beholden to them in various ways to keep it” is another. We would want to know the latter to critically appraise Alice’s work whenever it is relevant to OpenAI’s interests (and I would guess a lot of policy/eval/reg/etc. would be sufficiently relevant that we’d like to contemplate whether Alice’s commitments colour her position). Yet Alice has also promised to keep these extra relevant details secret.