Unless you can directly extract a sincere and accurate utility function from the participants’ brains, this is vulnerable to exaggeration at the stage where each participant programs their AI. Say my optimal amount of X is 6. I could program my AI to want 12 of X, but be willing to back off to 6 in exchange for concessions regarding Y from other AIs that don’t want much X.
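To make the incentive concrete, here is a minimal sketch of that example under a deliberately naive mechanism: the allocation of X is simply the average of the ideal points the AIs report. The averaging rule and the quadratic utility are assumptions added purely for illustration; only the figures 6 and 12 come from the example above.

```python
# Toy illustration (assumed mechanism, not anything proposed in the thread):
# two AIs report ideal amounts of good X, and the outcome is the mean of the
# reported ideals. My true optimum is 6; the other party's is 0.

def compromise(reported_ideals):
    """Allocate X at the mean of the reported ideal points."""
    return sum(reported_ideals) / len(reported_ideals)

def utility(true_ideal, outcome):
    """Quadratic loss: higher is better, peaking at the true ideal (an assumption)."""
    return -(outcome - true_ideal) ** 2

true_mine, true_other = 6, 0

honest = compromise([true_mine, true_other])   # both report truthfully -> 3.0
exaggerated = compromise([12, true_other])     # I report 12 instead    -> 6.0

print(utility(true_mine, honest))        # -9.0: truthful reporting leaves me at 3
print(utility(true_mine, exaggerated))   #  0.0: exaggerating lands me at my true optimum
```

Under truthful reporting the compromise lands at 3, while reporting 12 drags it to my true optimum of 6, which is exactly the exaggeration strategy described above; any mechanism that takes reported utilities at face value is exposed to this kind of manipulation.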
This does not seem to hold when the AIs are unable to read each other’s minds: your AI can be expected to lie to the others with more tactical effectiveness than you can lie indirectly by deceiving it about your preferences. Even in that case, it would be better to let the AI do the strategic rewriting itself on your behalf.
On a similar note, being able to directly extract a sincere and accurate utility function from the participants’ brains still leaves the system vulnerable to exploitation: individuals can rewrite their own preferences strategically in much the same way an AI can. Future-me may not be happy, but present-me gets what he wants, and I don’t (necessarily) have to care about future-me.
I had also mentioned this in an earlier comment on another thread. It turns out that this is a standard concern in bargaining theory. See section 11.2 of this review paper.
So, yeah, it’s a problem, but it has to be solved anyway in order for AIs to negotiate with each other.