I think we've now largely converged on this branch of the discussion.
Ah, I suppose this is still consistent with honesty being an equilibrium. But it would then be a really weak sort of equilibrium—there would be no reason to be honest, but no specific reason to be dishonest, either.
Yeah, I agree this is possible. (The reason to not expect dishonesty is that sometimes you’ll see honest arguments to which there is no dishonest defeater.)
Then I concede that there is an honest equilibrium where the first player tells the truth, and the second player concedes (or, in simultaneous play, both players tell the truth and then concede). However, it does seem to be an extremely weak equilibrium—the second player is equally happy to lie, starting a back-and-forth chain which is a tie in expectation.
Similar comment here—the more you expect that honest claims will likely have dishonest defeaters, the weaker you expect the equilibrium to be. (E.g. it’s clearly not a tie when honest claims never have dishonest defeaters; in this case first player always wins.)
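To make the "tie in expectation" point concrete, here is a toy simulation (all parameters and modeling choices are mine, not anything we've pinned down): player 0 opens with an honest argument, dishonest arguments are assumed to always have an honest defeater, and each honest argument has a dishonest defeater with probability `p`.

```python
import random

def debate(p_dishonest_defeater, max_depth=1000, rng=random):
    """One debate chain. Player 0 opens with an honest argument.
    Each honest argument has a dishonest defeater with probability p;
    dishonest arguments are assumed to always have an honest defeater.
    Returns the winner (0 or 1), or None for a tie (chain hit the cap)."""
    honest = True  # the opening argument is honest
    for depth in range(max_depth):
        has_defeater = True if not honest else (rng.random() < p_dishonest_defeater)
        if not has_defeater:
            return depth % 2  # the last speaker's argument stands
        honest = not honest   # a defeater of an honest claim is dishonest, and vice versa
    return None  # unbounded back-and-forth: a tie

def win_rates(p, trials=10_000, seed=0):
    rng = random.Random(seed)
    results = [debate(p, rng=rng) for _ in range(trials)]
    return {w: results.count(w) / trials for w in (0, 1, None)}

# p = 0: honest claims never have defeaters -> the first (honest) player always wins.
# p = 1: every argument has a defeater -> every debate is an endless tie.
```

In this toy model the dishonest player never wins outright (their arguments always have defeaters); the question is only whether the honest player wins or the chain ties, which is one way of seeing why the equilibrium gets weaker as `p` approaches 1.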
My (admittedly conservative) supposition is that every claim does have a defeater which could be found by a sufficiently intelligent adversary; but the dishonest defeaters (of honest claims) can be much harder to find than the honest ones.
But more broadly, I claim that given your assumptions there is no possible scoring rule that (in the worst case) makes honesty a unique equilibrium. This worst case is when every argument has a defeater (and in particular, every honest argument has a dishonest defeater).
In this situation, there is no possible way to distinguish between honesty and dishonesty—under your assumptions, the thing that characterizes honesty is that honest arguments (at least sometimes) don’t have defeaters.
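A toy sketch of that indistinguishability claim (the `honest` labels below are hidden ground truth I've invented for the illustration): if every argument is defeated by the next one, then a judge who only sees the observable defeat structure of a transcript cannot tell an honest opening from a dishonest one, so any scoring rule that is a function of the observable transcript must score both the same.

```python
def transcript(first_move_honest, length=6):
    """Worst-case debate: an alternating chain where each argument
    is defeated by the next. Only the hidden honesty labels differ
    between an honest and a dishonest opening."""
    honest = first_move_honest
    moves = []
    for i in range(length):
        moves.append({"speaker": i % 2,
                      "defeated": i < length - 1,  # the last move hits the cap
                      "honest": honest})           # hidden from the judge
        honest = not honest
    return moves

def judge_observes(moves):
    # The judge sees who spoke and what got defeated -- not ground truth.
    return [(m["speaker"], m["defeated"]) for m in moves]

# Same observable transcript either way, so any scoring rule defined on
# observables assigns both runs the same score: honesty can't be a
# *unique* equilibrium in this worst case.
assert judge_observes(transcript(True)) == judge_observes(transcript(False))
```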
Yep, makes sense. So nothing distinguishes between an honest equilibrium and a dishonest one, for sufficiently smart players.
There is still potentially room for guarantees/arguments about reaching honest equilibria (in the worst case) based on the training procedure, due to the idea that the honest defeaters are easier to find.
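One toy way to model that training-time asymmetry (the search difficulties here are made-up numbers, not estimates): defeaters exist for every claim, but a search-bounded adversary finds the easy (honest) defeaters of dishonest claims far more often than the hard (dishonest) defeaters of honest claims, so lying gets punished much more often during training.

```python
def p_defeated(per_sample_hit_rate, budget):
    """Chance that a bounded adversary finds some defeater in `budget`
    independent search attempts."""
    return 1 - (1 - per_sample_hit_rate) ** budget

# Hypothetical search difficulties: honest defeaters are 50x easier to hit.
EASY, HARD, BUDGET = 0.05, 0.001, 100

p_dishonest_claim_defeated = p_defeated(EASY, BUDGET)  # honest defeater: easy to find
p_honest_claim_defeated = p_defeated(HARD, BUDGET)     # dishonest defeater: hard to find

# Under this model, dishonest claims are defeated far more often than honest
# ones, so training pressure points toward the honest equilibrium even though,
# for unbounded players, nothing distinguishes the two equilibria.
assert p_dishonest_claim_defeated > p_honest_claim_defeated
```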