It is hard to pinpoint motivation here. If you are a top researcher at a top lab working on alignment and you disagree with something within the company, I see two categories of options you can take to try to fix things:
Stay and try to use your position of power to do good. Better that someone who deeply cares about AI risk is in charge than someone who doesn’t.
Leave in protest to try to sway public opinion into thinking that your organization is unsafe and that we should not trust it.
Jan and Ilya left but haven’t said much about how they lost confidence in OpenAI. I expect we will see them making more damning statements about OpenAI in the future.
Or is there a possible motivation I’m missing here?
It seems likely (though not certain) that they signed non-disparagement agreements, so we may not see more damning statements from them even if that’s how they feel. Also, Ilya at least said some positive things in his leaving announcement, which indicates either that he caved in to pressure (or was overly agreeable towards former co-workers) or that he’s genuinely not particularly worried about the direction of the company and left mainly for reasons related to his new project.
Someone serious about alignment who sees dangers had better do what is safe and not be swayed by a non-disparagement agreement. Speaking out might cost them some job prospects, money, and possibly a lawsuit, but what is that if the history of Earth is on the line? Especially since such a well-known AI genius would find plenty of support from people who approved of such an open move.
So I hope his reasoning is that talking right NOW is not strategically worth it. E.g., he might want to increase his chances of being hired by a semi-safety-serious company (more serious than OpenAI, but not serious enough to hire a proven whistleblower), where he could use his position to better effect.
I agree with what you say in the first paragraph. If you’re talking about Ilya, which I think you are, I can see what you mean in the second paragraph. But I’d flag that even if he had some sort of plan here, it seems pretty costly, and also just bad norms, for someone with his credibility to say something indicating that he thinks OpenAI is on track to handle its great responsibility well, if he didn’t actually believe this. It’s one thing not to say negative things explicitly; it’s a different thing to say something positive that rules out the negative interpretations. I tend to take people at their word if they say things explicitly, even when I can assume they were facing various pressures. If I were to assume that Ilya is saying positive things he doesn’t actually believe, that wouldn’t reflect well on him, IMO.
If we consider Jan Leike’s situation, I think what you’re saying applies more easily, because his leaving without comment already reflects poorly on OpenAI’s standing on safety, and maybe he just decided that saying something explicitly wouldn’t add much information (especially since other people might be in a better position to say things in the future). Also, I’m not sure it hurts future employment prospects much if someone leaves a company, signs a non-disparagement agreement, and goes “no comment” to indicate that there was probably dissatisfaction with some aspects of the company. There are many explanations for this, and if I were making hiring decisions at some AI company, even one focused quite a bit on profits, I wouldn’t necessarily interpret it as a negative signal.
That said, signing non-disparagement agreements certainly has costs and constrains option value, so it seems like a tough choice.
Well, one big reason is if they were prevented from doing the things they thought would constitute using their position of power to do good, or were otherwise made to feel that OpenAI wasn’t a good environment for them.
Leaving in order to dissuade others within the company is another possibility.
I assume they can’t make a statement, and that their choice of next occupation will be the clearest signal they can and will send to the public.