Definitely agree that an AI with no goal and maximum cooperation is best for society.
A couple of questions to help me understand the argument better:
CEV also acknowledges the volition/wants of all humans, finds a unified will, and then builds a utility function around it. What are the differences between recursive alignment and CEV?
As another note, how does the AI choose the best decision to make? Is it by evaluating all candidate decisions with a utility function, doing pairwise comparisons of the top decisions, or simply inferring a single decision?
You mentioned inference of a single decision plus retraining to ensure it stays aligned, though this might be suboptimal. Since you did mention CoT, however, the agent might be comparing and simulating the outcomes of decisions in its thought process.
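To make the three mechanisms I have in mind concrete, here is a minimal Python sketch; `utility`, `prefer`, and `policy` are hypothetical stand-ins of my own, not anything from the post.

```python
# Hypothetical sketch of the three selection mechanisms described above.

def argmax_over_utility(candidates, utility):
    """Score every candidate decision with a utility function and take the best."""
    return max(candidates, key=utility)

def pairwise_tournament(candidates, prefer):
    """Compare top candidates pairwise; prefer(a, b) returns whichever is preferred."""
    best = candidates[0]
    for c in candidates[1:]:
        best = prefer(best, c)
    return best

def single_inference(state, policy):
    """Infer one decision directly from the current state, with no explicit comparison."""
    return policy(state)
```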
Who do we account for in the alignment, and how do we weight them? E.g., how should AIs, humans, animals, and insects be weighted? We could have a huge number of AIs, but we don't want humans to count as a minuscule fraction.
I assume that since we want an unbiased view, it won't simply be an equal weighting of all parties, but rather a view that aims for an acceptable middle ground. The question then is what that middle ground looks like.
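As a toy illustration of the worry (the group names and population counts below are made up), weighting each individual equally lets a very large AI population dominate, while weighting each group equally is one possible middle ground:

```python
# Toy illustration: per-individual vs. per-group weighting (numbers are invented).
populations = {"humans": 8e9, "AIs": 1e12, "animals": 1e11}

# Equal weight per individual: each group's share is proportional to headcount.
total = sum(populations.values())
per_individual = {g: n / total for g, n in populations.items()}

# Equal weight per group: each group gets the same share regardless of headcount.
per_group = {g: 1 / len(populations) for g in populations}

print(per_individual)  # humans end up as a tiny fraction
print(per_group)       # humans count equally with the other groups
```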