The set of all possible sequences of actions is really, really, really big. Even if you have an AI that is really good at assigning the correct utilities[1] to any sequence of actions we test it with, its "near infinite sized"[2] learned model of our preferences is bound to come apart at the tails, or in some weird region we forgot to check up on.
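To make "coming apart at the tails" concrete, here's a minimal toy sketch (my own illustration, not anyone's actual reward model): fit a flexible "utility model" to samples of a stand-in true preference function drawn only from a narrow region, then query it just outside that region.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" utility over a 1-D action feature (toy stand-in).
def true_utility(x):
    return np.sin(x)

# Training data: sampled only from the region we thought to check.
x_train = rng.uniform(-2.0, 2.0, size=30)
y_train = true_utility(x_train) + rng.normal(0, 0.05, size=30)

# A flexible learned model: degree-9 polynomial fit by least squares.
model = np.poly1d(np.polyfit(x_train, y_train, deg=9))

# Inside the training region, the fit looks great...
print("inside training range:")
for x in np.linspace(-2.0, 2.0, 5):
    print(f"  x={x:+.1f}  true={true_utility(x):+.3f}  model={model(x):+.3f}")

# ...but just outside it, the model's "preferences" come apart.
print("outside training range (the tails):")
for x in np.linspace(2.5, 4.0, 4):
    print(f"  x={x:+.1f}  true={true_utility(x):+.3f}  model={model(x):+.3f}")
```

The polynomial is beside the point; any sufficiently expressive model trained on a bounded slice of action-space can behave arbitrarily badly in regions it was never tested on.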
Good luck getting the ethicists to come to a consensus on this.
Von Neumann: "With four parameters I can fit an elephant, and with five I can make him wiggle his trunk."