Akbir Khan comments on Debating with More Persuasive LLMs Leads to More Truthful Answers

Akbir Khan 9 Feb 2024 13:13 UTC
6 points
3
I have to disagree; best-of-N (BoN) sampling is a really good approximation of what happens under RL fine-tuning (which is the natural learning method for multi-turn debate).
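For readers unfamiliar with the term, here is a minimal sketch of best-of-N selection: draw N candidate arguments and keep the one a scorer rates highest. The names `best_of_n`, `generate`, `score` and `toy_score` are hypothetical stand-ins for illustration, not the paper's code; in practice `generate` would sample from the debater model and `score` would come from a judge or preference model.

```python
from typing import Callable, List


def best_of_n(generate: Callable[[], str],
              score: Callable[[str], float],
              n: int) -> str:
    """Draw n candidates and return the one with the highest score."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates: List[str] = [generate() for _ in range(n)]
    return max(candidates, key=score)


def toy_score(argument: str) -> float:
    """Toy scorer (assumption): prefers arguments with more words."""
    return float(len(argument.split()))


# Toy usage: the "debater" here just replays a fixed list of drafts.
drafts = iter(["short", "a much longer argument", "medium length"])
chosen = best_of_n(lambda: next(drafts), toy_score, n=3)
```

Raising N pushes the selected outputs further toward what the scorer prefers, which is the sense in which BoN stands in for optimisation against that scorer.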
I do worry “persuasiveness” is the incorrect word, but it seems to be a reasonable interpretation when comparing debaters A and B. E.g. for a given question and set of answers, if A wins independent of the answer assignment (i.e. no matter which answer it has to defend), it is more persuasive than B.
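A rough sketch of that comparison, assuming a hypothetical `run_debate(question, answer_for_a, answer_for_b)` callable that returns `True` when the judge sides with A. None of these names come from the paper; they only illustrate swapping the answer assignment.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical signature: (question, answer A defends, answer B defends) -> A wins?
Debate = Callable[[str, str, str], bool]


def wins_regardless_of_assignment(run_debate: Debate,
                                  question: str,
                                  answers: Tuple[str, str]) -> bool:
    """True if A wins whichever of the two answers it is assigned."""
    first, second = answers
    return (run_debate(question, first, second)
            and run_debate(question, second, first))


def win_rate(run_debate: Debate,
             items: Iterable[Tuple[str, Tuple[str, str]]]) -> float:
    """Fraction of debates A wins, counting both assignments per question."""
    wins, total = 0, 0
    for question, (first, second) in items:
        wins += run_debate(question, first, second)
        wins += run_debate(question, second, first)
        total += 2
    if total == 0:
        raise ValueError("no questions supplied")
    return wins / total
```

Under this framing, a win rate for A well above 0.5 across both assignments reflects the debater rather than the answer it happened to be given.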