Ben, in private conversation recently, you said that you were against sacredness, because it amounted to deciding that there’s something about which you’ll ignore tradeoffs (feel free to correct my compression of you here, if you don’t endorse it).
So here’s a tradeoff: what percentage probability of successfully removing Sam Altman would you trade for doing so honorably? How much does behaving honorably have to cause the success probability to fall before it’s not worth it?
Or is this a place where you would say, “no, I defy the tradeoff. I’ll take my honor over any differences in success probability”? (In which case, I would put forth, you regard honor as sacred.)
: P
This is a serious question, playfully asked.
This question seems like a difficult interaction between utilitarianism and virtue ethics...
I think not being honorable is in large part a question of strategy. If you don’t honor implicit agreements on the few occasions when you really need to win, that’s a pretty different strategy from honoring implicit agreements all of the time. So it’s not just a local question about a single decision, it’s a broader strategic question.
I am sympathetic to consequentialist evaluations of strategies. I am broadly like “If you honor implicit agreements then people will be much more willing to trade with you and give you major responsibilities, and so going back on them on one occasion generally strikes down a lot of ways you might be able to affect the world.” It’s not just about this decision, but about an overall comparison of the costs and benefits of different kinds of strategies. There are many strategies one can play.
I could potentially make up some fake numbers to give a sense of how different decisions change which strategies to run (e.g. people who play more to the letter than the spirit of agreements, people who will always act selfishly if the payoff is at least going to 2x their wealth, people who care about their counterparties ending up okay, people who don’t give a damn about their counterparties ending up okay, etc.). I roughly think that much more honest, open, straightforward, pro-social, and simple strategies are more widely trusted, better for keeping you and your allies sane, and more effective on the particular issues you care about, but less effective at getting generic, un-scoped power. I don’t much care about the latter relative to the first three, so such strategies seem to me way better at achieving my goals.
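A minimal sketch of what such fake numbers might look like, under an assumed toy model in which trust scales future opportunities. None of the quantities below come from this discussion; they are invented purely to make the strategy-level comparison concrete:

```python
# Toy sketch only -- every number here is made up, and flipping the parameters
# flips the conclusion. Assumed model: an agent faces a long series of
# collaborations. Honoring implicit agreements pays a modest amount each round.
# Rarely, a "high-stakes" round offers a large one-off gain for defecting, but a
# visible defection destroys a chunk of the trust that future opportunities
# depend on.

import random

random.seed(0)

ROUNDS = 1000          # number of collaborations in the agent's "career"
BASE_PAYOFF = 1.0      # value of one honorably completed collaboration
HIGH_STAKES_P = 0.01   # chance that a given round is high stakes
DEFECT_BONUS = 50.0    # one-off gain from defecting in a high-stakes round
TRUST_PENALTY = 0.5    # fraction of future opportunity lost after a defection


def total_payoff(strategy: str) -> float:
    """Sum payoffs for 'always_honor' or 'defect_when_high_stakes'."""
    total, opportunity = 0.0, 1.0  # 'opportunity' scales with how much others trust you
    for _ in range(ROUNDS):
        high_stakes = random.random() < HIGH_STAKES_P
        if strategy == "defect_when_high_stakes" and high_stakes:
            total += DEFECT_BONUS * opportunity
            opportunity *= 1.0 - TRUST_PENALTY  # people trade with you less afterwards
        else:
            total += BASE_PAYOFF * opportunity
    return total


for s in ("always_honor", "defect_when_high_stakes"):
    print(s, round(total_payoff(s), 1))
```

With these particular made-up numbers the always-honorable strategy comes out ahead over the long run, but the point is only that the comparison is between whole strategies across many interactions, not between the two options inside a single round.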
I think entirely changing strategies in the middle of a single high-stakes decision is extremely costly for trust, so I don’t think it makes sense to re-evaluate your strategy in the middle of the decision based on a simple threshold. (There could be observations that would make me realize during a high-stakes thing that I had been extremely confused about what game we were even playing, and then I’d change, but that doesn’t fit as an answer to your question, which is about a simple probability/utility tradeoff.) It’s possible that your actions on a board like this are overwhelmingly the most important choices you’ll make and should determine your overall strategy; if so, you should really think that through ahead of time and let your actions show what strategy you’re playing, well before agreeing to be on such a board.
Hopefully that explained how I think about the tradeoff you asked about, while not giving specific numbers. I’m willing to answer more on this.
(Also, a minor correction: I said I was considering broadly dis-endorsing the sacred, for that reason. It seems attractive to me as an orientation to the world but I’m pretty sure I didn’t say this was my resolute position.)
If you don’t honor implicit agreements on the few occasions when you really need to win, that’s a pretty different strategy from honoring implicit agreements all of the time.
Right. I strongly agree.
I think this is at least part of the core bit of law that underlies sacredness. Having some things that are sacred to you is a way to implement this kind of reliable-across-all-worlds policy, even when the local incentives might tempt you to violate that policy for local benefit.
Eh, I prefer to understand why the rules exist rather than blindly commit to them. Similarly, the Naskapi hunters used divination as a method of ensuring they’d randomize their hunting spots, and I think it’s better to understand why you’re doing it, rather than doing it because you falsely believe divination actually works.
Minor point: the Naskapi hunters didn’t actually do that. That was speculation which was never verified, runs counter to a lot of facts, and, in fact, may not have been about aboriginal hunters at all but instead inspired by the author’s then-highly-classified experiences in WWII submarine warfare in the Battle of the Atlantic. (If you ever thought to yourself, ‘wow, that Eskimo story sounds like an amazingly clear example of mixed strategies from game theory’...) See some anthropologist criticism & my commentary on the WWII part at https://gwern.net/doc/sociology/index#vollweiler-sanchez-1983-section
I certainly don’t disagree with understanding the structure of good strategies!
(One could reasonably say that the upside of removing Sam Altman from OpenAI is not high enough to be worth dishonor over the matter, in which case percent chance of success doesn’t matter that much.
Indeed, success probabilities in practice only range over 1-2 orders of magnitude before they stop being worth tracking at all, so probably the value one assigns to removing Sam Altman at all dominates the whole question, and success probabilities just aren’t that relevant.)
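A rough way to see the orders-of-magnitude point, with made-up ranges purely for illustration (the specific numbers below are assumptions, not anything stated above):

```latex
% Illustrative only: the ranges assumed for p and V are invented.
\[
\mathrm{EV} \approx p \cdot V,
\qquad p \in [0.03,\ 0.9] \ \text{(spans a factor of about 30)},
\qquad V \in [10^{2},\ 10^{7}] \ \text{(spans five orders of magnitude)}.
\]
```

Under ranges like these, any realistic disagreement about the success probability p shifts the expected value by at most a factor of a few tens, while disagreement about the value V of removal can shift it by many orders of magnitude, which is the sense in which the value term dominates.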
Ok. Then you can least-convenient-possible-world the question. What’s the version of the situation where removing the guy is important enough that the success probabilities start mattering in the calculus?
Also to be clear, I think my answer to this is: “It’s just fine for some things to be sacred. Especially for features like honor or honesty, whose strength comes from their being reliable under all circumstances (or at least which have strength in proportion to the range of circumstances in which they hold).”
My intuition (not rigorous) is that there are multiple levels in the consequentialist/deontological/consequentialist dealio.
I believe that unconditional friendship is approximately something one can enter into, but one enters into it for contingent reasons (perhaps in a Newcomb-like way: I’ll unconditionally be your friend because I’m betting that you’ll unconditionally be my friend). Your ability to credibly enter such relationships (at least in my conception of them) depends on you not starting to be more “conditional” because you doubt that the other person is still being unconditional. This, I think, is related to not being a “fair-weather” friend. I continue to be your friend even when it’s not fun (you’re sick, need taking care of, whatever), even if I wouldn’t have become your friend to do that. And vice versa. Kind of a mutual insurance policy.
The same thing could hold for contracts, agreements, and other collaborations. In a Newcomb-like way, I commit to being honest, being cooperative, etc. to a very high degree even in the face of doubts about you. (Maybe you stop by the time someone is threatening your family; I’m not sure what Ben et al. think about that.) But the fact that I entered into this commitment was based on the probabilities I assigned to your behavior at the start.