rvnnt comments on Hierarchical Agency: A Missing Piece in AI Alignment

rvnnt 27 Nov 2024 20:07 UTC
3 points
2
A related pattern-in-reality that I’ve had on my todo-list to investigate is something like “cooperation-enforcing structures”. Things like
- legal systems, police
- immune systems (esp. in suppressing cancer)
- social norms, reputation systems, etc.
I’d been approaching this from a perspective of “how defeating Moloch can happen in general” and “how might we steer Earth to be less Moloch-fucked”; not so much AI safety directly.

Do you think a good theory of hierarchical agency would subsume those kinds of patterns-in-reality? If yes: I wonder if their inclusion could be used as a criterion/heuristic for narrowing down the search for a good theory?
- Noosphere89 27 Nov 2024 22:05 UTC
  3 points
  0
  Parent
  Most of the basis of cooperation-enforcing structures, I'd argue, rests on two general principles:
  1. An iterated game, such that there is an equilibrium for cooperation, and
  2. The ability to enforce a threat of violence if a player defects, ideally credibly; this often extends to a monopoly on violence.
  Once you have those, cooperative equilibria become possible.
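A minimal sketch of the first principle, assuming a textbook iterated prisoner's dilemma: the payoff values, the discount factor `delta`, and the helper names (`grim_trigger`, `always_defect`, `cooperation_pays`) are all illustrative assumptions, not anything specified in this thread. Grim trigger stands in for the "threat if a player defects": it cooperates until the other player defects once, then punishes forever. The sketch only compares full cooperation against one natural deviation (defecting from the start).

```python
from typing import Callable, Dict, List, Tuple

# Illustrative prisoner's dilemma payoffs (assumed values) with T > R > P > S:
# temptation, reward, punishment, sucker's payoff.
T, R, P, S = 5.0, 3.0, 1.0, 0.0
PAYOFF: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}

# A strategy maps the opponent's past moves to "C" (cooperate) or "D" (defect).
Strategy = Callable[[List[str]], str]


def grim_trigger(opponent_history: List[str]) -> str:
    """Cooperate until the opponent defects once, then defect forever."""
    return "D" if "D" in opponent_history else "C"


def always_defect(opponent_history: List[str]) -> str:
    """Defect in every round regardless of history."""
    return "D"


def discounted_payoffs(a: Strategy, b: Strategy, delta: float,
                       rounds: int = 200) -> Tuple[float, float]:
    """Play a against b and return each player's discounted payoff sum."""
    hist_a: List[str] = []
    hist_b: List[str] = []
    total_a = total_b = 0.0
    for t in range(rounds):
        move_a, move_b = a(hist_b), b(hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        total_a += (delta ** t) * pay_a
        total_b += (delta ** t) * pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total_a, total_b


def cooperation_pays(delta: float) -> bool:
    """True if cooperating with a grim-trigger player beats always defecting."""
    cooperate, _ = discounted_payoffs(grim_trigger, grim_trigger, delta)
    deviate, _ = discounted_payoffs(always_defect, grim_trigger, delta)
    return cooperate >= deviate


if __name__ == "__main__":
    for delta in (0.2, 0.9):
        print(delta, cooperation_pays(delta))
```

With these assumed payoffs, a patient player (`delta = 0.9`) does better by cooperating, while an impatient one (`delta = 0.2`) does better by defecting, which is why both the iteration and the credibility of the punishment matter.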
  - Davidmanheim 2 Dec 2024 12:16 UTC
    2 points
    0
    Parent
    Norms can accomplish this as well—I wrote about this a couple weeks ago.
    - Noosphere89 2 Dec 2024 15:03 UTC
      2 points
      0
      Parent
      I basically agree that norms can accomplish this, conditional on the game always being iterated. Indeed, conditional on countries being far-sighted enough, almost any outcome is possible, thanks to the folk theorems.
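As a rough illustration of what "far-sighted enough" can mean, here is the standard textbook condition for one simple case. The symbols are assumptions introduced for illustration: a repeated prisoner's dilemma with stage payoffs $T > R > P > S$ (temptation, reward, punishment, sucker's payoff), a common discount factor $\delta \in (0, 1)$, and a grim-trigger punishment (defect forever after any defection). Cooperating forever must be worth at least as much as defecting once and being punished afterwards:

```latex
\frac{R}{1-\delta} \;\ge\; T + \frac{\delta P}{1-\delta}
\quad\Longleftrightarrow\quad
R \;\ge\; T - \delta\,(T - P)
\quad\Longleftrightarrow\quad
\delta \;\ge\; \frac{T - R}{T - P}
```

The folk theorems generalize this: for a discount factor close enough to 1, any feasible payoff profile that gives each player more than their minmax payoff can be supported as an equilibrium, which is the sense in which almost any outcome becomes possible.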