At some point in the discussion, he said “let’s assume the question of defining human values is solved.”
What did he need that assumption for? If the setting is that two AIs try to convince a human audience, the assumption doesn’t seem to enter into it. The important question is presumably what friendliness properties we can deduce about any AI that wins the debate against every other AI.