I am trying to gain some evidence about the situation we care about (intelligent human judges, superintelligent debaters, hard questions) by looking at human debates where the questions and debaters are simple enough that we have the epistemic high ground. In other words, does debate work among humans with a not-that-smart judge?
Some arguments are simple and easy to understand, while a full and detailed understanding of why they are wrong is much more complicated.
For example “You can’t explain where the universe comes from without god.”
This is a short and simple argument. What do you need to understand to really see why it is wrong? You need to know the difference between the intuitive human notion of simplicity and the minimum-message-length formalisation of Occam’s razor, and how “god” sounds like a simple hypothesis but isn’t one under the latter. You might also want to talk about making beliefs pay rent in anticipated experience, privileging the hypothesis, etc. Of course, these are insights that are hard to impart quickly to the typical man on the street.
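To make “sounds simple, but isn’t” concrete: the minimum-message-length criterion says to prefer the hypothesis $H$ that minimises the total description length of hypothesis plus data,

$$H^{*} = \arg\min_{H}\big[\,L(H) + L(D \mid H)\,\big]$$

where $L(H)$ is the length of the shortest full specification of $H$ and $L(D \mid H)$ is the length of the data encoded given $H$. The English word “god” is three letters, but $L(H)$ charges for the shortest complete specification of the hypothesis, and specifying a universe-creating mind is vastly longer than specifying simple physical laws.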
In a typical street debate, the argument goes like this:
“god doesn’t explain where the universe came from either”
“yes it does, god created the universe”
“but who created god?”
“god has always existed”
“then why can’t the universe have always existed”
“you’re the one who thinks the universe exploded out of nothing”
There seem to be lots of short, pithy, truthy-sounding statements on both sides. Debate at this level cannot reliably find the truth, even on easy questions.
Maybe there are fewer truths than lies, so the liar is freer to optimise: the honest debater is pinned to the one true answer, while the liar can search a large space of plausible falsehoods for the most persuasive one. A toy model of this asymmetry is sketched below.
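Here is a minimal sketch of that asymmetry, under the crude assumption that each candidate argument has an independent random persuasiveness score and the liar simply presents the best of many:

```python
import random

def expected_best_of(n_candidates, trials=10_000):
    """Average persuasiveness when you can present the best of n random arguments."""
    return sum(
        max(random.random() for _ in range(n_candidates))
        for _ in range(trials)
    ) / trials

# The honest debater is stuck with the one true argument (one draw);
# the liar optimises over many candidate falsehoods.
print(f"truth-teller, 1 argument:  {expected_best_of(1):.2f}")    # ~0.50
print(f"liar, 100 candidate lies:  {expected_best_of(100):.2f}")  # ~0.99
```

Under these admittedly crude assumptions, the liar’s best-sounding argument beats the truth almost every time, so long as the judge has nothing but surface persuasiveness to go on.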
Deconstructing a misleading argument and explaining how it is mistaken is often longer and harder than the argument itself. In which case, whatever the intelligence level of the debaters, there will be lies where the judge can’t understand the detailed explanation of exactly why the lie is wrong, and the AIs end up slinging superficially plausible arguments back and forth.
Suppose you take a person with a pop-sci understanding of quantum mechanics. You put them in a room with terminal connections to two superintelligences. In one week, the person will take a mathy quantum mechanics test. One superintelligence wants to maximise the person’s score, the other wants to minimise it. The person doesn’t know which is which. I think the person won’t do well on the test. Human understanding of quantum mechanics is a fragile thing, easily disrupted by maliciously chosen half-truths. If half the things you think you know are correct, and half are subtly and maliciously wrong and confusing, you will not be able to produce correct conclusions.
Human society already contains a lot of bullshit. Memetic evolution is fairly stupid; being an evolution, it lacks any foresight. There are relatively few smart humans setting out to deliberately produce bullshit, compared to those seeking to understand the world. Bullshit that humans are smart enough to design, humans are often smart enough to reverse-engineer, to spot and understand. That is, I think the balance of the game board in AI debate would be tilted towards producing bullshit compared to current human discussion, and the equilibrium of current human discussion isn’t great.