What about the toy version of the alignment-via-debate problem, where two human experts try to convince a human layman about a complex issue that the layman lacks the biological capability to fully understand (e.g. a 90 IQ layman and the Poincaré Conjecture)? Have experiments been run on this? I just don’t see how someone who can’t “get” calculus after many years of trying can separate good arguments from bad ones in a field far beyond their ability to understand.
You might look here for more info: https://www.alignmentforum.org/posts/PJLABqQ962hZEqhdB/debate-update-obfuscated-arguments-problem