7.2 subpoint 3 also works in isolation and doesn’t require causality:
An intelligent agent has a better chance of proving to other agents that it is willing and able to engage in mutual cooperation if it starts early. Later it might no longer be able to rewrite itself, or not to a degree acceptable to other agents. Thus the intelligent agent should begin acting benevolently, or at least somewhat cooperatively, toward other actors/intelligences/beings in its domain.
Note: This also applies to humans. Our behavior toward other intelligences, our ability and willingness to cooperate, and our benevolence will prove to other agents that we will uphold our part of the trade (in expectation).
Yes, it seems to work for human reputation. If I precommit to cooperation with others from, say, school, it will eventually be well known to the people around me, even if I can't predict how they will learn it. And if I am a scumbag, that will also become part of my reputation.
Sure, there are people who become scumbags only when they expect that nobody will ever know, but that is a bad strategy: it requires complete knowledge of all possible futures, which is computationally expensive and likely impossible. So it is better to be nice from the start.
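A minimal expected-value framing of why this is a bad strategy (my own addition, not from the parent comment): suppose conditional defection yields a one-off gain $g$, is eventually detected with probability $p$, and detection costs future cooperation worth $r$. Then defecting only pays when
$$g > p\,r \quad\Longleftrightarrow\quad p < g/r.$$
Over a long horizon $r$ grows, so the tolerable detection probability shrinks toward zero, and "defect only when nobody will ever know" ends up demanding near-perfect foresight about all futures.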
Agreed, though I was not referring to this strategy but to a variant of it. We as a human society might want to move away from meat consumption for that reason.
For an AGI that can make part of itself legible, it might be a good strategy to prove that it was (probably, and to some degree) cooperating with other agents in its lightcone.
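To make the legibility idea a bit more concrete, here is a toy sketch (entirely my own framing, not something stated above) of one possible mechanism: the agent publishes cryptographic commitments to its decisions as it makes them, so it can later reveal the records and let others verify that its claimed history was fixed in advance. The record format and field names are assumptions for illustration.

```python
import hashlib
import json

def commit(record: dict) -> str:
    """Return a SHA-256 commitment to one decision record (hypothetical format)."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

# At decision time the agent publishes only the hash...
decision = {"round": 17, "partner": "agent_B", "action": "cooperate"}
published_hash = commit(decision)

# ...and later reveals the record; anyone can check it matches the earlier hash.
assert commit(decision) == published_hash
```

Note that such a scheme only shows the revealed history was not rewritten after the fact, not that it is complete, which is part of why starting to cooperate early matters.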
Interesting point about meat consumption.