Tristan Wegner comments on Ilya Sutskever and Jan Leike resign from OpenAI [updated]

Tristan Wegner 15 May 2024 14:48 UTC
14 points
6
Someone serious about alignment who sees dangers had better do what is safe and not be influenced by a non-disparagement agreement. It might cost them some job prospects, money, and possibly a lawsuit, but what if the history of Earth is on the line? Especially since such a well-known AI genius would find plenty of support from people who back such an open move.
So I hope he simply judges that talking right NOW is not strategically worth it. E.g., he might want to increase his chance of being hired by a semi-safety-serious company (more serious than OpenAI, but not enough to hire a proven whistleblower), where he can use his position better.
- Lukas_Gloor 15 May 2024 15:17 UTC
  10 points
  3
  Parent
  I agree with what you say in the first paragraph. If you’re talking about Ilya, which I think you are, I can see what you mean in the second paragraph, but I’d flag that even if he had some sort of plan here, it seems pretty costly and also just bad norms for someone with his credibility to say something that indicates that he thinks OpenAI is on track to do well at handling their great responsibility, assuming he were to not actually believe this. It’s one thing to not say negative things explicitly; it’s a different thing to say something positive that rules out the negative interpretations. I tend to take people at their word if they say things explicitly, even if I can assume that they were facing various pressures. If I were to assume that Ilya is saying positive things that he doesn’t actually believe, that wouldn’t reflect well on him, IMO.
  
  If we consider Jan Leike’s situation, I think what you’re saying applies more easily, because him leaving without comment already reflects poorly on OpenAI’s standing on safety, and maybe he just decided that saying something explicitly doesn’t really add a ton of information (esp. since maybe there are other people who might be in a better position to say things in the future). Also, I’m not sure it affects future employment prospects too much if someone leaves a company, signs a non-disparagement agreement, and goes “no comment” to indicate that there was probably dissatisfaction with some aspects of the company. There are many explanations for this and if I was making hiring decisions at some AI company, even if it’s focused on profits quite a bit, I wouldn’t necessarily interpret this as a negative signal.
  
  That said, signing non-disparagement agreements certainly feels like it has costs and constrains option value, so it seems like a tough choice.