I was previously unaware of Section 4.2 of the Scalable AI Safety via Doubly-Efficient Debate paper and, hurray, it does give an answer to (2) in Section 4.2. (Thanks for mentioning, @niplav!) That still leaves (1) unanswered, or at least not answered clearly enough, imo. Also I am curious about the extent that other people, who find debate promising, consider this paper’s answer to (2) as the answer to (2).
For what it’s worth, none of the other results that I know about were helpful for me for understanding (1) and (2). (The things I know about are the original AI Safety via Debate paper, follow-up reports by OpenAI, the single- and two-step debate papers, the Anthropic 2023 post, the Khan et al. (2024) paper. Some more LW posts, including mine.) I can of course make some guesses regarding plausible answers to (1) and (2). But most of these papers are primarily concerned with exploring the properties of debates, but not explaining where debate fits in the process of producing an AI (and what problem it aims to address).
I was previously unaware of Section 4.2 of the Scalable AI Safety via Doubly-Efficient Debate paper and, hurray, it does give an answer to (2) in Section 4.2. (Thanks for mentioning, @niplav!) That still leaves (1) unanswered, or at least not answered clearly enough, imo. Also I am curious about the extent that other people, who find debate promising, consider this paper’s answer to (2) as the answer to (2).
For what it’s worth, none of the other results that I know about were helpful for me for understanding (1) and (2). (The things I know about are the original AI Safety via Debate paper, follow-up reports by OpenAI, the single- and two-step debate papers, the Anthropic 2023 post, the Khan et al. (2024) paper. Some more LW posts, including mine.) I can of course make some guesses regarding plausible answers to (1) and (2). But most of these papers are primarily concerned with exploring the properties of debates, but not explaining where debate fits in the process of producing an AI (and what problem it aims to address).