Absolutely not. When I make an agreement to work closely with you on a crucial project, if I think you’re deceiving me, I will let you know. I will not surprise-backstab you and get on with my day. I will tell you outright, and I will say it loudly. I may move quickly to disable you if it’s an especially extreme circumstance, but I will acknowledge that this is a cost to our general cooperative norms, where people are given space to respond even if I assign a decent chance to them behaving poorly. Furthermore, I will provide evidence and argument in response to criticism of my decision by other stakeholders who are shocked and concerned by it.
Shame on you for suggesting only your tribe knows or cares about honoring partnerships with people after you’ve lost trust in them. Other people know what’s decent too.
That does not seem sensible to me. We’re not demons, for whom the exact semantics of our contracts are binding no matter what. If you’ve entered an agreement with someone, and later learned that they intend (and perhaps have always intended) to exploit your acting in accordance with it to screw you over, it seems both common-sensically and game-theoretically sound to consider the contract null and void, since it was agreed to based on false premises.
If the other person merely turned out to be generically a bad/unpleasant person, with no indication they’re planning to specifically exploit their contract with you? Then yes, backstabbing them is dishonorable.
But if you’ve realized, with high confidence, that they’re not actually cooperating with you the way you understood them to be, and they know they’re not cooperating with you the way you expect them to be, and they deliberately maintain your misunderstanding in order to exploit you? Then, I feel, giving them advance warning is just pointless self-harm in the name of role-playing being cooperative; not a move to altruistically preserve social norms.
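A toy iterated-dilemma sketch of that payoff claim (the payoff matrix and the “detection round” below are assumptions for illustration, not anything from this exchange): against a counterparty secretly committed to exploiting you, an advance-warning period of continued cooperation just transfers extra rounds of payoff to them.

```python
# Toy iterated prisoner's dilemma with standard (assumed) payoffs.
# C = cooperate, D = defect; PAYOFF maps (my_move, their_move) -> my payoff.
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

DETECT_ROUND = 2  # round at which their defection becomes practically certain

def my_total(strategy, rounds=10):
    """Total payoff against a counterparty who defects every round
    while feigning cooperation."""
    return sum(PAYOFF[(strategy(r), "D")] for r in range(rounds))

def with_notice(r):
    """Keep cooperating for a 3-round 'advance warning' period after detection."""
    return "C" if r < DETECT_ROUND + 3 else "D"

def void_on_detection(r):
    """Treat the agreement as void the moment defection is verified."""
    return "C" if r < DETECT_ROUND else "D"

print(my_total(with_notice))        # 5: the notice period is a pure transfer
print(my_total(void_on_detection))  # 8: losses stop at the detection round
```

This deliberately prices only the direct cost of warning; the reputational and norm-level effects debated in the rest of this thread are outside the toy model.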
If you’ve entered an agreement with someone, and later learned that they intend (and perhaps have always intended) to exploit your acting in accordance with it to screw you over, it seems both common-sensically and game-theoretically sound to consider the contract null and void, since it was agreed to based on false premises.
If you make a trade agreement, and the other side does not actually pay up, then I do not think you are bound to provide the good anyway. It was a trade.
If you make a commitment, and then later come to realize that in requesting that commitment the other party was actually taking advantage of you, I think there are a host of different strategies one could pick. I think my current ideal solution is “nonetheless follow through on your commitment, but make them pay for it in some other way”, but I acknowledge that there are times when it’s correct to pick other strategies like “just don’t do it, and when anyone asks you why, give them a straight answer” and more.
Your strategy in a given domain will also depend on all sorts of factors like how costly the commitment is, how much they’re taking advantage of you for, what recourse you have outside of the commitment (e.g. if they’ve broken the law they can be prosecuted, but in other cases it is harder to punish them).
The thing I currently believe and want to say here is that it is not good to renege on commitments even if you have reason to; it is better to keep them while setting the incentives right in some other way. Reneging can be the right choice in order to set the incentives right, but even when it’s the right call, I want to acknowledge that it is a cost to our ability to trust in people’s commitments.
Sure, I agree with all of the local points you’ve stated here (“local” as in “not taking into account the particulars of the recent OpenAI drama”). For clarity, my previous disagreement was with the following local claim:
When I make an agreement to work closely with you on a crucial project, if I think you’re deceiving me, I will let you know
In my view, “knowingly executing your part of the agreement in a way misaligned with my understanding of how that part is to be executed” counts as “not paying up in a trade agreement”, and is therefore grounds for ceasing to act in accordance with the agreement on my end too. From this latest comment, it sounds like you’d agree with that?
Reading the other branch of this thread, you seem to disagree that this was the situation the OpenAI board was in. Sure, I’m hardly certain of it myself. However, if it were, and if they were highly certain of being in that position, I think their actions are fairly justified.
My understanding is that OpenAI’s foundational conceit was prioritizing safety over profit/power-pursuit, and that their non-standard governance structure was explicitly designed to allow the board to take draconian measures if they concluded the company had gone astray. Indeed, going off these sorts of disclaimers, or even the recent actions, it seems they were hardly subtle or apologetic about such matters.
“Don’t make us think that you’d diverge from our foundational conceit if given the power to, or else” was part of the deal Sam Altman effectively signed by taking the CEO role. And if the board had concluded that this term was violated, then taking drastic and discourteous measures to remove him from power seems entirely fine to me.
Paraphrasing: while in the general case of deal-making, a mere “I lost trust in you” is not reasonable grounds for terminating the deal, my understanding is that “we have continued trust in you” was part of this specific deal, meaning that losing trust was reasonable grounds for terminating this specific deal.
I acknowledge, though, that it’s possible I’m extrapolating incorrectly from the “high-risk investment” disclaimer plus their current actions, and that the board had actually failed to communicate this to Sam when hiring him. Do we have cause to believe that, though?
When I make an agreement to work closely with you on a crucial project,
I agree that there are versions of “agreeing to work closely together on the crucial project” where I see this as “speak up now or otherwise allow this person into your circle of trust.” Once someone is in that circle, you cannot kick them out without notice just because you think you observed stuff that made you change your mind – if you could do that, it wouldn’t work as a circle of trust.
So, there are circumstances where I’d agree with you. Whether the relationship between a board member and a CEO should be like that could be our crux here. I’d say yes in the ideal, but was it like that for the members of the board and Altman? I’d say it depends on the specific history. And my guess is that, no, there was no point where the board could have said “actually we’re not yet sure we want to let Altman into our circle of trust, let’s hold off on that.” And there’s no yes without the possibility of no.
if I think you’re deceiving me, I will let you know.
I agree that one needs to do this if one has lost faith in people who once made it into one’s circle of trust. However, let’s assume they were never there to begin with. Then it’s highly unwise if you’re dealing with someone without morals who feels zero obligation towards you in return. Don’t give them an advance warning out of respect or a sense of moral obligation. If your mental model of the person is “this person will internally laugh at you for being stupid enough to give them advance warning and will gladly use the info you gave against you,” then it would be foolish to tell them. Batman shouldn’t tell the Joker that he’s coming for him.
I may move quickly to disable you if it’s an especially extreme circumstance, but I will acknowledge that this is a cost to our general cooperative norms, where people are given space to respond even if I assign a decent chance to them behaving poorly.
What I meant to say in my initial comment is the same thing as you’re saying here.
“Acknowledging the cost” is also an important part of how I think about it, but I see that cost as being borne not by the Joker (out of respect towards him) but by the broader cooperative fabric. [Edit: deleted a passage here because it was long-winded.]
“If I assign a decent chance to them behaving poorly” – note that in my description, I spoke of practical certainty, not just “a decent chance”. Even in contexts where I think mutual expectations of trustworthiness and cooperativeness are lower than in what I call “circles of trust,” I’m all in favor of preserving respect until way past the point where you’re just a bit suspicious of someone. It’s just that, if the stakes are high and you’re not in a high-trust relationship with the person (i.e., you don’t have a high prior that they’re for sure cooperating back with you), there has to come a point where you stop giving them free information that could harm you.
I admit this is a step in the direction of act utilitarianism, and act utilitarianism is a terrible, wrong ideology. However, I think it’s only a step and not all the way, and there’s IMO a way to codify rules/virtues where it’s okay to take these steps without getting onto a slippery slope. We can have a moral injunction where we’d only make such moves against other people if our confidence is significantly higher than it needs to be on mere act-utilitarian grounds. Basically, you either need smoking-gun evidence of something sufficiently extreme, or need to get counsel from other people and see if they agree (to filter out unilateralism in your judgment), or have other safety-checks like that in place before allowing yourself to act.
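A minimal sketch of how such an injunction differs from bare act-utilitarian expected value (the function names, margin, and numbers below are illustrative assumptions, not anything proposed in this thread): plain EV already licenses acting at low credence when stakes are asymmetric, so the injunction demands a confidence margin on top, plus an outside check.

```python
# Toy decision rule: a moral injunction layered on top of act-utilitarian EV.
# All names and numbers are hypothetical illustrations.

def break_even_credence(harm_if_guilty: float, harm_if_innocent: float) -> float:
    """Credence p at which acting is exactly EV-neutral:
    p * harm_if_guilty == (1 - p) * harm_if_innocent."""
    return harm_if_innocent / (harm_if_guilty + harm_if_innocent)

def may_act(p: float, harm_if_guilty: float, harm_if_innocent: float,
            independent_counsel_agrees: bool, margin: float = 0.2) -> bool:
    """Require confidence well above the mere act-utilitarian threshold,
    plus an outside check to filter out unilateralist error."""
    threshold = min(1.0, break_even_credence(harm_if_guilty, harm_if_innocent) + margin)
    return p >= threshold and independent_counsel_agrees

# With large asymmetric stakes, plain EV would already license acting at
# low credence (~0.09 here); the injunction deliberately demands more.
print(round(break_even_credence(100, 10), 2))                   # 0.09
print(may_act(0.5, 100, 10, independent_counsel_agrees=True))   # True
print(may_act(0.12, 100, 10, independent_counsel_agrees=True))  # False: below margin
```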
I think what further complicates the issue is that there are “malefactor types” who are genuinely concerned about doing the right thing, and who look capable of cooperating with people in their inner circle, but who are then too ready to make huge rationalization-induced updates (almost like “splitting” in BPD) that the other party was bad all along and is now out of the circle. Their inner circle is way too fluid, and their true circle of trust is only themselves. The existence of this phenotype means that if someone like that tries to follow the norms I just advocated, they will do harm. How do I incorporate this into my suggested policy? I feel like this is analogous to discussions about modest vs. non-modest epistemology. What if you’re someone who’s deluded into thinking he’s Napoleon or some genius scientist? If someone is deluded like that, non-modest epistemology doesn’t work. To this, I say “epistemology is only helpful if you’re not already hopelessly deluded.” Likewise, what if your psychology is hopelessly self-deceiving and you’ll do on-net-harmful, self-serving things even when you try your best not to? Well, sucks to be you (or rather, sucks for other people that you exist), but that doesn’t mean that people with a more trust-compatible psychology have to change the way they go about building a fabric of trust – a fabric that, importantly, also has to be protected against invasion by malefactors.
I actually think it’s a defensible position to say that the temptation to decide who is or isn’t “trustworthy” is too big, that humans need moral injunctions, and that Batman should give the Joker an advance warning. So I’m not saying you’re obviously wrong here, but I think my view is defensible as well, and I like it better than yours and will keep acting in accordance with it. (One reason I like it better: if I trust you, and you play “cooperate” with someone who only knows deception and who moves against you and your cooperation partners and destroys a ton of value, then I shouldn’t have trusted you either. Being too indiscriminately cooperative makes you less trustworthy in a different sort of way.)
Shame on you for suggesting only your tribe knows or cares about honoring partnerships with people after you’ve lost trust in them. Other people know what’s decent too.
I think there’s something off about the way you express whatever you meant to express here – something about how you’re importing your frame of things over mine and claiming that I said something in the language of your frame, which makes it seem more obviously bad/“shameful” than it would if you expressed it under my frame.
[Edit: November 22nd, 20:46 UK time. Oh, I get it now. You totally misunderstood what I meant here! I was criticizing EAs for doing this too naively. I was not praising the norms of my in-group (EA). Your reply actually confused me so much that I thought you were being snarky at me in some really strange way. Like, I thought you knew I was criticizing EAs. I guess you might identify as more of a rationalist than an EA, so I should have said “only EAs and rationalists” to avoid confusion. And like I say below, this was somewhat hyperbolic.]
In any case, I’d understand it if you had said something like “shame on you for disclosing to the world that you think of trust in a way that makes you less trustworthy (according to my, Ben’s, interpretation)”. If that’s what you had said, I’m now replying that I hope you no longer think this after reading what I elaborated above.
Edit: And to address the part about “your tribe” – okay, I was being hyperbolic about only EAs having a tendency to be (what-I-consider-to-be) naive when it comes to applying norms of cooperation. It’s probably also common in other high-trust ideological communities. I think it actually isn’t very common in Silicon Valley, which very much supports my point here. When people get fired or backstabbed over startup drama (I’m thinking of the movie The Social Network), they are not given a three-month adjustment period where nothing really changes except that they now know what’s coming. Instead, they have their privileges revoked and passwords changed, and they have to leave the building.

I think focusing on how much notice someone was given is more a part of the power struggle and the war over who has enough leverage to get others on their side than it is genuinely about “this particular violation of niceness norms is so important that it deserves to be such a strong focus of this debate.” Correspondingly, I think people would complain a lot less about how much notice was given if the board had done a better job convincing others that their concerns were fully justified. (Also, Altman himself certainly wasn’t going to give Helen a lot of time still staying on the board and adjusting to the upcoming change, still talking to others about her views and participating in board business, etc., when he initially thought he could get rid of her.)
I.

I find the situation a little hard to talk about concretely because whatever concrete description I give will not be correct (because nobody involved is telling us what happened).
Nonetheless, let us consider the most uncharitable narrative regarding Altman here, where members of the board come to believe he is a lizard, a person who is purely selfish and who has no honor. (To be clear I do not think this is accurate, I am using it for communication purposes.) Here are some rules.
Just because someone is a lizard does not make it okay to lie to them.
Just because someone is a lizard does not make it okay to go back on agreements with them.
If the lizard made agreements and commitments on behalf of your company while holding the mandate to do so, it is not now okay to disregard those agreements and commitments.
The situation must not be “I’ll treat you honorably if I think you’re a good person, but the moment I decide you’re a lizard, I’ll act with no honor myself.” The situation must be “I will treat you honorably because it is right to be honorable.” Otherwise the honor will seep out of the system as the probabilities we assign to others’ honor waver.
I think it is damaging to the trust people place in board members, to see them act with so little respect or honor. It reduces everyone’s faith in one another to see people in powerful positions behave badly.
II.
I respect that, in response to my disapproval of your statement, you took the time to explain in detail the reasoning behind your comment and communicate some more of your perspective on the relevant game theory. I think it generally helps, when folks are having conflicts, to openly examine the reasons why decisions were made and investigate those. It also gives us more surface area for locating the key parts of the disagreement.
I still disagree with you. I think it was an easy-and-wrong thing to suggest that only people in the EA tribe would care about this important ethical principle I care about. But I am glad we’re exploring this rather than just papering over it, or just being antagonistic, or just leaving.
III.
CEO:
“Suppose you come to the conclusion that I’m a lizard. Will you give me no chance for a rebuttal, and fire me immediately, without giving our business partners notice, and never give me a set of reasons, and never tell our staff a set of reasons?”
Prospective Board Member:
“No, you can be confident that I would not do that. We would conduct an investigation, and at that time bar your ability to affect the board. We would be candid with the staff about our concerns, and we would not wantonly harm the agreements you made with your business partners.”
CEO:
“But what if you came to believe that I was maneuvering to remove power from you within days?”
Prospective Board Member:
“I think there are worlds where I would take sudden action. I could see myself voting to remove you from the board while the investigation is under way, and letting the staff and business partners know that we’re investigating you and a possible outcome is you being fired.”
Contracts are filled with many explicit terms and agreements, but I believe they also ~all come with an implicit one: in making this deal, we are agreeing not to screw each other over. If, when accepting the board seat, they would have considered this sudden firing without cause and without explanation to the staff to be screwing Altman over, and if they did not bring up this sort of action as something they might do before it was time to do so, then they should not have done it.
IV.
I agree that there are versions of “agreeing to work closely together on the crucial project” where I see this as “speak up now or otherwise allow this person into your circle of trust.” Once someone is in that circle, you cannot kick them out without notice just because you think you observed stuff that made you change your mind – if you could do that, it wouldn’t work as a circle of trust.
I don’t think this is a “circle of trust”. I think accepting a board seat is an agreement: an agreement to be given responsibility, and to use it well and in accordance with good principles. I think there is a principle of giving someone a chance to respond, and of being open about why you are destroying their life and company before you do so, regardless of context, and you don’t forgo that just because they are acting antagonistically toward you. Barring illegal acts or acts of direct violence, you should give someone a chance to respond, and be open about why, before you destroy everything they’ve built.
Batman shouldn’t tell the Joker that he’s coming for him.
The Joker had killed many people by the time Batman came for him. From many perspectives, this is currently primarily a disagreement over managing a lot of money and a great company. The two are not similar.
Perhaps you wish to say that Altman is in an equivalent moral position, as his work is on track to be directly responsible for an AI takeover (as I believe), similar in impact to an extinction event. I think that if Toner/McCauley/etc. believed this, then they should have said so openly far, far earlier, so that their counter-parties in this conflict (and everyone else in the world) were aware of the rules at play.
I don’t believe that any of them said this before they were given board seats.
V.
In the most uncharitable case (toward Altman) where they believed he was a lizard, they should probably have opened up an investigation before firing him, and taken some action to prevent him from outvoting them (e.g. just removed him from the board, or added an extra board member).
They claim to have done a thorough investigation. Yet it has produced no written results, and they could not provide any written evidence to Emmett Shear. So I do not believe they have done a proper investigation or produced any evidence to others. If they can produce ~no evidence to others, then they should cast a vote of no confidence, fire Altman, install a new CEO, install a new board, and quit. I would have respected them more if they had also stated that they did not act honorably in ousting Altman and would be looking for a new board to replace them.
You can choose to fire someone for misbehavior even when you have no legible evidence of misbehavior. But then you have to think about how you can gain the trust of the next person who comes along, who understands you fired the last person with no clear grounds.
VI.
Lukas: I think it’s a thing that only EAs would think up that it’s valuable to be cooperative towards people who you’re convinced are deceptive/lack integrity.
Ben: Shame on you for suggesting only your tribe knows or cares about honoring partnerships with people after you’ve lost trust in them. Other people know what’s decent too.
Lukas: I think there’s something off about the way you express whatever you meant to express here – something about how you’re importing your frame of things over mine and claiming that I said something in the language of your frame, which makes it seem more obviously bad/“shameful” than it would if you expressed it under my frame.
In any case, I’d understand it if you had said something like “shame on you for disclosing to the world that you think of trust in a way that makes you less trustworthy (according to my, Ben’s, interpretation)”. If that’s what you had said, I’m now replying that I hope you no longer think this after reading what I elaborated above.
I keep reading this and not understanding your last reply. I’ll rephrase my understanding of our positions.
I think you view the board firing situation thus: some people, who didn’t strongly trust Altman, were given the power to oust him, came to think he’s a lizard (with zero concrete evidence), and then just got rid of him.
I’m saying that even if that’s true, they should have acted more respectfully toward him and honored their agreement to wield the power with care, so they should have given him notice and done a proper investigation (again, given that they have zero concrete evidence).
I think that you view me as trying to extend the principle of charity arbitrarily far (to the point of self-harm), and so you’re calling me naive and too cooperative – a lizard’s a lizard, just destroy it.
I’m saying that you should honor the agreement you’ve made to wield your power well and not cruelly or destructively. It seems to me that it has likely been wielded very aggressively, and in a way where I cannot tell that it was done justly. A man was told on Friday that he had been severed from the ~$100B company he had built. He was given no cause, the company was given no cause, it appears there was barely any clear cause, and there was no way to make the decision right (were it a mistake). This method currently seems to me both a little cruel and a little power-hungry/unjust, even when I assume the overall call was the correct one.
For you to say that I’m just another EA who is playing cooperate-bot lands with me as (a) inaccurately calling me naive and rounding my position off to a stupider one, (b) disrespecting all the other people in the world who care about people wielding power well, and (c) kind of saying your tribe is the only one with good people in it. Which I think is a pretty inappropriate reply.
I have provided some rebuttals on a bunch of specific points above. Sorry for the too-long comment.
I see interesting points on both sides here. Something about how this comment(s) is expressed makes me feel uncomfortable, like this isn’t the right tone for exploring disagreements about correct moral/cooperative behavior – or at least it makes it a lot harder for me to participate. I think it’s something like: it feels like performing moral outrage/indignation in a way that feels more persuadey than explainy, and more in the direction of social pressure and norms-enforcery. The phrase “shame on you” is a particularly clear thing I’ll point at that makes me perceive this.
a) A lot of your points are specifically about Altman and the board, whereas many of my points started that way but then went into the abstract/hypothetical/philosophical. At least, that’s how I meant it – I should have made this clearer. I was assuming, for the sake of the argument, that we’re speaking of a situation where the person in the board’s position found out that someone else is deceptive to their very core, with no redeeming principles they adhere to – basically what you’re describing in your point “I” with the lizardpeople. I focused on that type of discussion because I felt like you were attacking my principles, and I care about defending my specific framework of integrity. (I’ve commented elsewhere on things that I think the board should or shouldn’t have done, so I also care about that, but I’ve probably already spent too many comments on speculations about the board’s actions.)

Specifically about the actual situation with Altman, you say:

“I’m saying that you should honor the agreement you’ve made to wield your power well and not cruelly or destructively. It seems to me that it has likely been wielded very aggressively and in a way where I cannot tell that it was done justly.”

I very much agree with that, fwiw. I think it’s very possible that the board did not act with integrity here. I’m just saying that I can totally see circumstances where they did act with integrity. The crux for me is: what did they believe about Altman, how confident were they in their take, and did they make an effort to factor in moral injunctions against using their power in a self-serving way?
b) You make it seem like I’m saying that it’s okay to move against people (and e.g. oust them) without justifying yourself later, or without giving them the chance to reply at some point later when they’re in a less threatening position. I think we’re on the same page about this: I don’t believe that would be okay. I wasn’t saying that you don’t have to answer for what you did. I was just saying that it can, under some circumstances, be okay to act first and then explain yourself to others later and re-establish yourself as trustworthy.
c) About your first point (point “I”), I disagree. I think you’re too deontological here. Numbers do count. Being unfair to someone who you think is a bad actor, but who turns out not to be one, has a victim count of one. Letting a bad actor take over the startup/community/world you care about has a victim count of way more than one. It can be absolutely shocking how high that count can go (in terms of the various harms caused by the bad tail of bad actors), depending on the situation – think of Epstein or of dictators. On top of that, there are indirect bad effects that don’t quite fit the name “victim count” but that still weigh heavily, such as distorted epistemics or the destruction of a high-trust environment when it gets invaded by bad actors.

Concretely, when you talk about the importance of the variable “respect towards Altman in the context of how much notice to give him,” I’m mostly thinking: sure, it would be nice to be friendly and respectful, but that’s a small issue compared to considerations like “if the board is correct, how much could he mobilize opposition against them if he had a longer notice period?” So I thought three months’ notice would be inappropriate, given what’s asymmetrically at stake on the two sides of the equation. (This might change once we factor in optics, and how much easier it is for Altman to mobilize opposition if he can say he was treated unfairly – for some reason, this always works wonders; DARVO is like dark magic. Sure, it sucks for Altman to lose the $100 billion company he built. But an out-of-control CEO recklessly building the most dangerous tech in the world sucks more, for way more people, in expectation.) In the abstract, I think it reflects an unfair and inappropriate sense of what matters if a single person accused of being a bad actor gets more respect than their many prospective victims would suffer in expectation.

And I’m annoyed that it feels like you took the moral high ground here by making it seem like my positions are immoral. But maybe you meant the “shame on you” for just one isolated sentence, and not for my stance as a whole; I’d find that more reasonable. In any case, I understand now that you probably feel bothered for an analogous reason, namely that I made a remark about how it’s naive to be highly charitable or cooperative under circumstances where I think that’s no longer appropriate. I want to flag that nothing you wrote in your newest reply seems naive to me, even though I do find it misguided. (The only thing I thought was maybe naive was the point about three months’ notice – though I get why you made it, and I generally really appreciate concrete examples like that of things the board could have done. I just think it would backfire when someone used those months to make moves against you.)
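To put toy numbers on that asymmetry (every figure below is a stipulated assumption, purely for illustration):

```python
# Toy expected-harm comparison behind "numbers do count".
# Every figure below is an assumption for illustration only.
p_bad = 0.8              # board's credence that the CEO is a bad actor
victims_if_bad = 1000    # harm (in victim-equivalents) if a bad actor keeps control
victims_if_wronged = 1   # harm if an innocent person is ousted unfairly

expected_harm_inaction = p_bad * victims_if_bad           # 800.0
expected_harm_acting = (1 - p_bad) * victims_if_wronged   # 0.2

print(expected_harm_inaction, expected_harm_acting)
```

On these stipulated numbers the comparison is lopsided, which is the point; the injunction sketched earlier is what keeps this arithmetic from collapsing into bare act utilitarianism.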
d) The “shame on you” referred to something where you perceived me to be tribal, but I don’t really get what that was about. You write: “and (c) kind of saying your tribe is the only one with good people in it.” That is not at all what I was saying. I was saying my tribe is the only one with people who are naive in such-and-such specific way – and yeah, that was unfair towards EAs, but then it’s not tribal (I self-identify as an EA), and I feel it’s okay to use hyperbole this way sometimes to point at something I perceive to be a bit of a problem in my tribe. In any case, it distorts things when you accuse me of something that only makes sense if you import your frame onto what I said. I didn’t think of this as a virtue, so I wasn’t claiming that other communities don’t also have good people.
e) Your point “III” reminds me of the essay by Eliezer titled “Meta-Honesty: Firming Up Honesty Around Its Edge-Cases.” Just as Eliezer in that essay explains that there are circumstances where he thinks you can hide info or even deceive, there are circumstances where I think you can move against someone and oust them without advance notice. If a prospective CEO interviews me as a board member, I’m happy to tell them exactly under which circumstances I would give them advance notice (or things like second and third chances) and under which ones I wouldn’t. (This is what reminded me of the essay and the dialogues with the Gestapo officer.) (That said, I’d decline the role, because I’d probably have overdosed on anxiety medication if I had been in the OpenAI board’s position.)

The circumstances would have to be fairly extreme for me not to give advance warnings or second chances. So if a CEO thinks I’m the sort of person who doesn’t have a habit of interpreting lots of things in a black-and-white and uncharitable manner, they wouldn’t have anything to fear, provided they’re planning on behaving well and are at least minimally skilled at trust-building – at making themselves, their motives, and their reasons for actions transparent.
f) You say:

“I think it is damaging to the trust people place in board members, to see them act with so little respect or honor. It reduces everyone’s faith in one another to see people in powerful positions behave badly.”

I agree that it’s damaging, but the way I see it, the problem here is the existence of psychopaths and other types of “bad actors” (or “malefactors”). They are why issues around trust and trustworthiness are sometimes so vexed and complicated. It would be wonderful if such phenotypes didn’t exist, but we have to face reality. It doesn’t actually help “the social fabric/fabric of trust” if one lends too much trust to people who abuse it to harm others and add more deception. On the contrary, it makes things worse.
g) I appreciate what you say in the first paragraph of your point IV! I feel the same way about this. (I should probably have said this earlier in my reply, but I’m about to go to sleep and so don’t want to re-alphabetize all of the points.)
I’m uncomfortable reading this comment. I believe you identified as an EA for much of your adult life, and the handle “EA” gives me lots of the bits that distinguish you from the general population. But you take for granted that Lukas’ “EA” is meant to exclude you.
By contrast, I would not have felt uncomfortable if you had been claiming adherence to the same standards by your childhood friends, startup culture, or some other clearly third party group.
Oh, but I don’t mean to say that Lukas was excluding me. I mean he was excluding all other people who exist who would also care about honoring partnerships after losing faith in the counter-party, of which there are more than just me, and more than just EAs.
OK, I guess the confusion was that it seemed like your counter-evidence to his claim was about how you would behave.
Yep, I can see how that could be confusing in context.
I think it was clear from context that Lukas’ “EAs” was intentionally meant to include Ben, and is also meant as a gentle rebuke re: naivete, not a serious claim re: honesty.
I feel like you misread Lukas, and his words weren’t particularly unclear.