I think there are a couple of interesting elements here.
Given that the individual AI representatives will act on their users' preferences/values, there will be many situations where the optimal move is not what the individual person believes should be done.
Take a simple example. A large part (basically all?) of the US population wants cheap housing to be available, and for elite housing to be built in a value-maximizing way (i.e. the elite want to get their money's worth). Yet common preferences are "no new housing built near me, where the noise/traffic/sight will affect me," "building new luxury housing won't lower the market price for housing because demand is infinite," and "also, I don't like seeing homeless people."
What a person claims to want is opposed to how they want the government to act.
This will also make it difficult to audit one's AI representative. Decisions will become extremely complex negotiations.
If a single person's only voice is a vote, then for most issues the preferences of most voters don't matter; they can be ignored on the margin. This is because current democracy "bundles" decisions. Perhaps you had in mind a direct democracy where a person's AI representative votes on every decision.
If you can separate the how from the what, I wonder what people actually disagree on. An enormous number of political conflicts seem to be disputes over the how, where people cannot agree on what policy has the highest probability of achieving a goal.
This is essentially just human ignorance: given a common data set about the world, you cannot agree to disagree. At any instant in time there is exactly one rational policy choice, namely the one with the highest expected value (measured by backtesting etc.).
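To make that claim concrete, here is a minimal sketch of what "one optimal policy" would mean, assuming agents share a dataset and an evaluation metric; the function names are illustrative placeholders, not a real system:

```python
def backtested_ev(policy, historical_data):
    """Average outcome the policy would have produced on the shared dataset."""
    outcomes = [policy(scenario) for scenario in historical_data]
    return sum(outcomes) / len(outcomes)

def optimal_policy(candidate_policies, historical_data):
    """With a common dataset and metric, the argmax is unique up to ties."""
    return max(candidate_policies, key=lambda p: backtested_ev(p, historical_data))
```

Under those assumptions, remaining disagreement about the how reflects differing data or values, not rational disagreement.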
Very good points.
Yes, I was imagining that this would enable a direct democracy style system, with less dependence on elected representatives and less bundling of issues.
I was also imagining that it could be tested to see what the theoretical outcomes would have been, and tried out on small politically polarized groups (a toy version of such a test is sketched below).
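Purely as an illustration of how such testing could start, here is a toy simulation (all numbers and issue counts are invented) comparing per-issue majority outcomes against a single bundled vote between two fixed platforms. It is only meant to show how "less bundling" can change outcomes, not to model real politics:

```python
import random

random.seed(0)
NUM_VOTERS = 1000
NUM_ISSUES = 5

# Each voter has an independent yes/no preference on each issue.
voters = [[random.random() < 0.5 for _ in range(NUM_ISSUES)] for _ in range(NUM_VOTERS)]

# Unbundled (direct-democracy style): each issue decided by its own majority.
unbundled = [sum(v[i] for v in voters) > NUM_VOTERS / 2 for i in range(NUM_ISSUES)]

# Bundled: voters pick whichever of two fixed platforms matches more of their preferences.
platform_a = [True] * NUM_ISSUES
platform_b = [False] * NUM_ISSUES

def agreement(prefs, platform):
    return sum(p == q for p, q in zip(prefs, platform))

a_votes = sum(agreement(v, platform_a) >= agreement(v, platform_b) for v in voters)
bundled = platform_a if a_votes > NUM_VOTERS / 2 else platform_b

print("Per-issue outcomes:", unbundled)
print("Bundled outcome:   ", bundled)
```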
The difficulty of auditing is a tricky one. But since people will have control over their own agent, they can instruct their agent to be more blunt and less strategic if they want.
I think separating the how from the what is tricky. I think futarchy is one of the few proposals I’ve heard to potentially help with this. I think having a congress of AI agents all based on the same LLM, differing only in the complex prompt they have been given, at least reduces the problem of intelligence differentials. I imagine the prompt is generated by an automated process of the user answering a long series of questions about their values. Then users can opt to add additional specifications such as the directive to be blunt for easier auditing. But only via a process of dialogue with the agent and automatic summarization of the dialogue, so that it would be harder to do weird prompt engineering stuff.
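A rough sketch of what that prompt-generation pipeline might look like, with the caveat that the question list, function names, and the summarize_dialogue hook are all hypothetical stand-ins for whatever the real system would use:

```python
VALUE_QUESTIONS = [
    "How should the agent weigh local impacts against broader economic effects?",
    "How much risk is acceptable when a policy's outcome is uncertain?",
    # ... a long series of further value questions
]

def build_value_profile(answers):
    """Turn questionnaire answers into the base prompt; every agent shares this format."""
    lines = [f"Q: {q}\nA: {a}" for q, a in zip(VALUE_QUESTIONS, answers)]
    return "You are a political representative acting on these values:\n" + "\n".join(lines)

def add_directives(base_prompt, dialogue, summarize_dialogue):
    """Append user-requested directives (e.g. 'be blunt for easier auditing'),
    but only via an automatic summary of the dialogue, so raw prompt
    engineering is harder to smuggle in."""
    summary = summarize_dialogue(dialogue)
    return base_prompt + "\nAdditional directives (summarized):\n" + summary
```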
I do think that even once you've gotten past the how and can focus on the what, you will find at least some remaining disagreements: different fundamental values between different people.