You get to write an AI, and decide how it handles its value function.
However, the value function gets to be written by a group that wins a lottery; there are 100 groups who have put in for the lottery. Two of the groups want human extinction, one voluntarily and one involuntarily; thirty of the groups want all humans to follow their religion; seven of the groups want the world economy to reflect their preferred economic model; and most of the remaining groups want their non-religious cultural values to be enshrined for all time.
How important is it to you that there be no value drift?
I’m not 100% sure whether your intention is to equate democratic governance with this lottery hypothetical, but I don’t think the two can really be compared. As for how important or how desirable I consider value drift to be, I think it’s rather like asking how important the drift of your car is: it depends on the road.
No, that is not the point.
Suppose for a moment that what you want is possible. Suppose it is possible to write values into an organization such that the organization never stops supporting those values. Is there any point in your own country’s history where you would have wanted them to use this technique to ensure that the values of that era were preserved forever more? What other groups alive today would you actually trust with the ability to do this?
But that isn’t what I want, and it’s not what I’m saying here. At no point do I claim that the values represented by the safety team are or should be static. I understand the point you’re making; I’ve even written about it pretty extensively here. But as far as I can see it’s a much more general ethical issue than the domain of this essay, and it applies just as readily to literally any organisation as it does to the theoretical organisations proposed here.
What values wider society holds, and how it evolves those values, is specifically not the purview of this essay. Whatever those values are, within a reasonable domain of the possibility space, the production team remains orthogonal to them. E.g. if your society is single-mindedly focused on religious fervour, your societal values are still orthogonal to any good production team, so it doesn’t really affect the point I’m making all that much.