I was thinking specifically here of maximizing the value function (desires) across the agents interacting with each other. Or, more specifically, adapting the system so that it self-maintains the “maximizing the value function (desires) across the agents” property.
An example is an economic system that seeks to maximize total welfare. Current systems, though, don’t maintain themselves: more powerful agents take over the control mechanisms (or adjust the market rules) so that they are favoured (lobbying, cheating, ignoring the rules, undermining enforcement). Similar problems occur in other kinds of coalitions.
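As a toy illustration of the capture dynamic (the names, numbers, and rules below are my own invention, not anything from the discussion): a system whose stated objective is total welfare, where the allocation rule can be rewritten by a powerful agent. Capture is individually rational for the captor even though it lowers the very objective the system was supposed to maximize, because some value is burned on the capture process itself.

```python
def total_welfare(allocation):
    """The social objective: the sum of every agent's utility (here, its share)."""
    return sum(allocation.values())

def fair_rule(agents, budget):
    """The intended rule: split the budget evenly across all agents."""
    share = budget / len(agents)
    return {a: share for a in agents}

def captured_rule(agents, budget, captor, skim=0.5, waste=0.2):
    """The rule after the most powerful agent rewrites it in its own favour.

    The captor diverts a `skim` fraction of the budget; capture itself
    (lobbying, evading enforcement) burns a `waste` fraction of what is
    diverted, so total welfare falls even though the captor comes out ahead.
    """
    grabbed = budget * skim * (1 - waste)      # what the captor keeps
    rest = budget * (1 - skim)                 # what everyone else splits
    share = rest / (len(agents) - 1)
    allocation = {a: share for a in agents if a != captor}
    allocation[captor] = grabbed
    return allocation

agents = ["A", "B", "C", "D"]
fair = fair_rule(agents, 100)
captured = captured_rule(agents, 100, captor="A")

# The captor gains (40 vs. a fair 25), but total welfare drops (90 vs. 100):
# the system no longer maintains its own maximization property.
print(fair["A"], total_welfare(fair))          # 25.0 100.0
print(captured["A"], total_welfare(captured))  # 40.0 ~90.0
```

The point of the sketch is only that the incentive to rewrite the rules exists whenever capture pays the captor more than its fair share, regardless of the cost to the objective.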
Postulating a more powerful agent that enforces this maximization property (an aligned super-AGI) is cheating, unless you can describe how that agent works and how it maintains both itself and this goal.
However, finding a system of agents that self-maintains this property with no “super agent” might lead to solutions for AGI alignment, or might prevent the creation of such a misaligned agent.
I read a while ago that the design/theory of corruption-resistant systems is an area that has not received much research.
However, finding a system of agents that self-maintains this property with no “super agent” might lead to solutions for AGI alignment, or might prevent the creation of such a misaligned agent.
I doubt that, because intelligence explosions, or the lead-ups to them, make things local.