In your use of respect for autonomy as a goal: are you referring to something like Empowerment is (almost) All We Need? I do find that to be an appealing alignment target. (I think I’m using alignment slightly more broadly, as in Hubinger’s definition; I have a post in progress on the terminology of different alignment/goal targets and the resulting confusions.)
The problem with empowerment as an ASI goal is, once again: empowering whom? And do you empower them to create more beings like themselves, whom you then also have to empower? Roger Dearnaley notes that if we empower everyone, humans will probably lose out either to something with less volition that uses fewer resources, like insects, or to something with more volition to empower, like other ASIs. Do we really want to limit the future to baseline humans? And how do we handle humans who want to create tons more humans?
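For concreteness, empowerment in this literature is usually formalized (following Klyubin et al.) as the channel capacity from an agent’s action sequences to its resulting future states. In a deterministic toy world this collapses to counting distinct reachable states, which makes the "empowering whom?" question concrete: the measure depends entirely on whose action channel you maximize. A minimal sketch, assuming a hypothetical 1-D gridworld (all names here are illustrative, not from any cited post):

```python
import math

def step(state, action, size=5):
    """Deterministic 1-D world: move left/right or stay, clamped to [0, size-1]."""
    x = state + {"L": -1, "R": 1, "S": 0}[action]
    return max(0, min(size - 1, x))

def empowerment(state, n, size=5):
    """n-step empowerment of a deterministic environment: channel capacity from
    length-n action sequences to outcomes, i.e. log2(# distinct reachable states)."""
    frontier = {state}
    for _ in range(n):
        frontier = {step(s, a, size) for s in frontier for a in "LRS"}
    return math.log2(len(frontier))
```

For example, an agent against the wall (`empowerment(0, 1)`) has fewer reachable states, and hence less empowerment, than one in the open (`empowerment(2, 1)`); an empowerment-maximizing helper would push agents away from walls, but says nothing about which agents count.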
I actually do expect intent alignment to remain secure enough to contain AI-originating agency, as long as it’s the primary goal or “singular target”. It’s counterintuitive that a superintelligent being could want nothing more than to do what its principal wants it to do, but I think it’s coherent. And the more competent it gets, the better it will be at doing what you want and nothing more. Before it’s that competent, the principal can give more careful instructions, including instructions to check before acting and to help with its alignment in various ways.
I agree that respect for autonomy/empowerment is one instruction/intent you could give. I do expect that someone will turn their intent-aligned AGI into an autonomous AGI at some point; hopefully after they’re quite confident in its alignment and the worth of that goal.
Respect for autonomy is not quite empowerment; it’s more like being left alone. The use of this concept is more in defining what it means for an agent or a civilization to develop relatively undisturbed, without getting overwritten by external influence, than in considering ways of helping it develop. So it’s also a building block for defining extrapolated volition, because that involves an extended period of not getting destroyed by external influences. But it’s conceptually prior to extrapolated volition: it doesn’t depend on already knowing what that is, and it’s a simpler notion.
It’s not by itself a good singular target for an AI to pursue: for example, it doesn’t protect humans from building more extinction-worthy AIs within their membranes, and it doesn’t facilitate any sort of empowerment. But it seems simple and agreeable enough as a universal norm to be a plausible aspect of many naturally developing AI goals, and it doesn’t require absence of interaction, so it allows empowerment etc. if that is also something others provide.
See 4. A Moral Case for Evolved-Sapience-Chauvinism and 5. Moral Value for Sentient Animals? Alas, Not Yet from Roger’s AI, Alignment, and Ethics sequence.