That’s a choice, though. AGI could, for example, look like a powerful actor in its own right, with its own completely nonhuman drives and priorities, and a total disinterest in being directed in the sort of way you’d normally associate with a “resource”.
My claim is that the incentives AGI creates are quite similar to those created by the resource curse, not that AGI would literally behave like a resource. But:
If by “intent alignment” you mean AGIs or ASIs taking orders from humans, and presumably specifically the humans who “own” them, or are in charge of the “powerful actors”, or form some human social elite, then it seems as though your concerns very much argue that that’s not the right kind of alignment to be going for.
My default is that powerful actors will do their best to build systems that do what they ask them to do (i.e., they will not pursue aligning systems with human values).
The field points towards this: alignment efforts are primarily focused on controlling systems. I don’t think this is inherently a bad thing, but it results in the incentives I’m concerned about. I’ve not seen great work on defining human values, creating a value set a system could follow, and ensuring the system follows it in a way that couldn’t be overridden by its creators. Anthropic’s Constitutional AI may be a counterexample.
The incentives point towards this as well. A system that is aligned to refuse efforts that could lead to resource/power/capital concentration would be difficult to sell to corporations that are likely to pursue such concentration.
These definitions (here, here, and here) are roughly what I am describing as intent alignment.