The failure of counter-arguments argument
Suppose you read a convincing-seeming argument by Karl Marx, and get swept up in the beauty of the rhetoric and clarity of the exposition. Or maybe a creationist argument carries you away with its elegance and power. Or maybe you’ve read Eliezer’s take on AI risk, and, again, it seems pretty convincing.
How could you know if these arguments are sound? Ok, you could whack the creationist argument with the scientific method, and Karl Marx with the verdict of history, but what would you do if neither were available (as neither is currently available for assessing the AI risk argument)? Even if you're pretty smart, there's no guarantee that you haven't missed a subtle logical flaw or a dubious premise or two, or that you haven't got caught up in the rhetoric.
One thing, though, should make you believe the argument more strongly: if the argument has been repeatedly criticised, and the criticisms have failed to puncture it. Unless you have the time to become an expert yourself, this is the best way to evaluate an argument where evidence isn't available or conclusive. After all, opposing experts presumably know the subject intimately, and are motivated to identify and illuminate the argument's weaknesses.
If the counter-arguments seem incisive, pointing out serious flaws, or if the main argument has to be continually patched to defend it against criticism—well, that is strong evidence that the main argument is flawed. Conversely, if the counter-arguments continually fail, then that is good evidence that the main argument is sound. Not logical evidence—a failure to find a disproof doesn't establish a proposition—but good Bayesian evidence.
In fact, the failure of counter-arguments is much stronger evidence than anything in the argument itself. If you can't find a flaw, that just means you can't find a flaw. If counter-arguments fail, that means many smart and knowledgeable people have thought deeply about the argument—and haven't found a flaw.
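To make the Bayesian point concrete, here is a minimal sketch in Python. The probabilities are illustrative assumptions of mine (how often expert critiques fail against sound versus flawed arguments), not estimates drawn from the AI risk debate; the only point is that if failed critiques are more likely when the argument is sound, each failed critique shifts your credence towards soundness.

```python
# A minimal sketch of the Bayesian point above. The numbers are
# illustrative assumptions, not estimates from the AI risk debate.

def update(prior_sound, p_fail_given_sound, p_fail_given_flawed):
    """Posterior probability that the argument is sound,
    after observing one counter-argument fail."""
    p_fail = (p_fail_given_sound * prior_sound
              + p_fail_given_flawed * (1 - prior_sound))
    return p_fail_given_sound * prior_sound / p_fail

prior = 0.5                # initial credence that the argument is sound
p_fail_if_sound = 0.9      # sound arguments usually survive critique
p_fail_if_flawed = 0.5     # flawed arguments survive critique less often

# Each failed counter-argument nudges the credence upwards.
for i in range(3):
    prior = update(prior, p_fail_if_sound, p_fail_if_flawed)
    print(f"after {i + 1} failed critiques: {prior:.2f}")
```

Under those assumed numbers, the credence climbs from 0.5 to roughly 0.64, 0.76 and 0.85 after one, two and three failed critiques.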
And as far as I can tell, critics have consistently failed to counter the AI risk argument. To pick just one example, Holden recently provided a cogent critique of the value of MIRI's focus on AI risk reduction. Eliezer wrote a response to it (I wrote one as well). The core of Eliezer's response, and of mine, wasn't anything new; both were mainly rehashes of what had been said before, with a different emphasis.
And most responses to critics of the AI risk argument take this form. After thinking for a short while, one can rephrase essentially the same argument, with a change in emphasis, to take down the criticism. After a few examples it becomes quite easy, a kind of paint-by-numbers process of showing that the ideas the critic has assumed do not actually make the AI safe.
You may not agree with my assessment of the critiques, but if you do, then you should adjust your belief in AI risk upwards. There’s a kind of “conservation of expected evidence” here: if the critiques had succeeded, you’d have reduced the probability of AI risk, so their failure must push you in the opposite direction.
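To spell out the identity behind that "conservation of expected evidence" point, here is a minimal sketch; the labels H ("the AI risk argument is sound") and E ("the critiques succeed") are just my shorthand for this discussion.

```latex
% A sketch of the standard "conservation of expected evidence" identity.
% H = "the AI risk argument is sound"; E = "the critiques succeed".
\documentclass{article}
\begin{document}
\[
  P(H) = P(H \mid E)\,P(E) + P(H \mid \neg E)\,P(\neg E)
\]
$P(H)$ is a weighted average of $P(H \mid E)$ and $P(H \mid \neg E)$.
So if $P(H \mid E) < P(H)$ (a successful critique would have lowered
your credence), then $P(H \mid \neg E) > P(H)$ (a failed critique must
raise it).
\end{document}
```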
In my opinion, the strength of the AI risk argument derives 30% from the actual argument and 70% from the failure of counter-arguments. That 70% would be higher, but we haven't yet seen the most prominent people in the AI community take a really good swing at it.