There is a difference between the claim that powerful agents are approximately well-described as being expected utility maximizers (which may or may not be true) and the claim that AGI systems will have an explicit utility function the moment they’re turned on, and maximize that function from that moment on.
I think this is the assumption OP is pointing out: “most of the book’s discussion of AI risk frames the AI as having a certain set of goals from the moment it’s turned on, and ruthlessly pursuing those to the best of its ability”. “From the moment it’s turned on” is the important part, because it rules out value learning as a solution.
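To make the distinction concrete, here is a minimal toy sketch (my own illustration, not from the book or from OP): one agent has an explicit, fixed utility from step zero, while the other starts with only a placeholder estimate and refines it from feedback. The class names, action set, and running-average update are all assumptions chosen purely for illustration.

```python
ACTIONS = ["a", "b", "c"]

class FixedUtilityAgent:
    """Has an explicit utility function from the moment it's 'turned on'."""
    def __init__(self, utility: dict):
        self.utility = utility  # fixed forever; the goal never changes

    def act(self) -> str:
        return max(ACTIONS, key=lambda a: self.utility[a])

class ValueLearningAgent:
    """Starts with a neutral estimate and updates it from feedback over time."""
    def __init__(self):
        self.estimate = {a: 0.0 for a in ACTIONS}  # initial guess, not a final goal
        self.counts = {a: 0 for a in ACTIONS}

    def act(self) -> str:
        # Acts greedily with respect to the *current* estimate; the goal is still in flux.
        return max(ACTIONS, key=lambda a: self.estimate[a])

    def observe_feedback(self, action: str, reward: float) -> None:
        # Running-average update: the effective "utility" changes as evidence arrives.
        self.counts[action] += 1
        self.estimate[action] += (reward - self.estimate[action]) / self.counts[action]
```

The second agent may still end up well-described as an expected utility maximizer once its estimate converges, but it does not have the fixed goals "from the moment it's turned on" that the quoted framing assumes.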