A “discussion” has broken out between Eliezer Yudkowsky and Greg Egan; scroll down here. It yielded this highly interesting comment by Eliezer Yudkowsky:
I don’t think the odds of us being wiped out by badly done AI are small. I think they’re easily larger than 10%. And if you can carry a qualitative argument that the probability is under, say, 1%, then that means AI is probably the wrong use of marginal resources – not because global warming is more important, of course, but because other ignored existential risks like nanotech would be more important. I am not trying to play burden-of-proof tennis. If the chances are under 1%, that’s low enough, we’ll drop the AI business from consideration until everything more realistic has been handled.
Finally, some quantification!
Here’s a sequence of interpretations of this passage, in decreasing order of strength:
1. The odds of us being wiped out by badly done AI are easily larger than 10%.
2. The odds of us being wiped out by badly done AI are larger than or equal to 10%.
3. There can be no compelling qualitative argument that the probability of us being wiped out by badly done AI is less than 1%.
4. There is a compelling argument that the probability of us being wiped out by badly done AI is greater than or equal to 1%.
I would be very grateful to see the weakest of these claims, number 4, supported with some calculations.
Of course, I wish there were a date attached to these claims. Easily greater than a 10% chance that we’ll be wiped out in the next 50 years?
Eliezer Yudkowsky says:
John did ask about timescales and my answer was that I had no logical way of knowing the answer to that question and was reluctant to just make one up.
[...]
As for guessing the timescales, that actually seems to me much harder than guessing the qualitative answer to the question “Will an intelligence explosion occur?”
We seem to have caught Yudkowsky in a moment of hypocrisy: he doesn’t know when an intelligence explosion will occur.
The post “I don’t know” is about refusing to assign probability distributions at all. That’s entirely different from refusing to assign an overly focused probability distribution when your epistemic state doesn’t actually give you enough information for one; the latter is the technical way to say “I don’t know” when you really don’t know. In this case I do recall Eliezer saying, at some point, something like this: he spends about 50% of his planning effort on scenarios where the singularity happens before 2040(?) and about 50% on scenarios where it happens after 2040. So he clearly does have a probability distribution he’s working with; it’s just that the probability mass is spread pretty broadly.
I agree that “spread out probability mass” is a good technical replacement for “I don’t know.” Note that the more spread out it is, the less concentrated it is in the near future. That is, the less confident you are betting on this particular random variable (time until human extinction), the safer you should feel from human extinction.
“50% before 2040” doesn’t sound like such a high-entropy RV to me, though...
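A minimal sketch of the point at issue in this exchange, with invented numbers: the headline constraint “50% of the mass before 2040” is compatible with very different amounts of spread. The date buckets and both toy distributions below are made up purely for illustration.

```python
import math

# Hypothetical date buckets for "when does the intelligence explosion happen?"
buckets = ["before 2020", "2020-2039", "2040-2059", "2060-2079", "2080 or later"]

# Two invented distributions; both put 50% of their mass before 2040,
# but one concentrates it near the present and one spreads it out.
concentrated = [0.45, 0.05, 0.05, 0.05, 0.40]
spread_out   = [0.10, 0.40, 0.20, 0.15, 0.15]

def entropy_bits(p):
    """Shannon entropy in bits; larger means the mass is more spread out."""
    return -sum(q * math.log2(q) for q in p if q > 0)

print("date buckets:", ", ".join(buckets))
for name, p in (("concentrated", concentrated), ("spread out", spread_out)):
    print(f"{name:>12}: P(before 2040) = {p[0] + p[1]:.2f}, "
          f"P(nearest bucket) = {p[0]:.2f}, entropy = {entropy_bits(p):.2f} bits")
```

In this toy example the higher-entropy distribution also puts far less mass in the nearest bucket, which is the sense in which “more spread out” translates into “less imminent”; the 50%-before-2040 summary alone does not settle how much entropy, or how much near-term risk, the full distribution carries.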
Well, let’s start with the conditional probability, given that we don’t find some other way to kill ourselves or end civilization before it comes to this. Eliezer seems to argue the following:
A. Given we survive long enough, we’ll find a way to write a self-modifying program that has, or can develop, human-level intelligence. (The capacity for self-modification follows from ‘artificial human intelligence,’ but since we’ve just seen links to writers ignoring that fact I thought I’d state it explicitly.) This necessarily gives the AI the potential for greater-than-human intelligence due to our known flaws. (I don’t know how we’d give it all of our disadvantages even if we wanted to. If we did, then someone else could and eventually would build an AI without such limits.)
B. Given A, the intelligence would improve itself to the point where we could no longer predict its actions in any detail.
C. Given B, the AI could escape from any box we put it in. (IIRC this excludes certain forms of encryption, but I see no remotely credible scenario in which we sufficiently encrypt every self-modifying AI forever.)
D. Given B and C, the AI could wipe out humanity if it ‘wanted’ to do so.
My estimate for the probability of some of these fluctuates from day to day, but I tend to give them all a high number. Claim A in particular seems almost undeniable given the evidence of our own existence. (I only listed that one separately so that people who want to argue can do so more precisely.) And when it comes to Claim E, that if you tell a computer to kill you it will try to kill you, I don’t think the alternative has enough evidence to even consider. So I find it hard to imagine anyone rationally getting a total lower than 12%, or just under 1⁄8.
Now that all applies to the conditional probability (if human technological civilization lives that long). I don’t know how to evaluate the timescale involved or the chance of us killing ourselves before the issue would come up. The latter certainly feels like less than 11⁄12.
The question would grow in importance if we found out that we needed to convince a nationally important number of people to pay attention to the issue before someone creates a theory of Friendly AI including AI goal stability. I really hope that doesn’t apply, because I suspect that if it does we’re screwed.
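A rough sketch of the arithmetic behind a “total” for chained claims like these, using placeholder numbers rather than the commenter’s own: because each claim is conditioned on the ones before it, the chain rule makes the joint probability the product of the conditional probabilities, and the “less than 11⁄12” remark can be read as turning the conditional 12% into an unconditional floor of about 1%.

```python
# Placeholder estimates for the chained claims: P(A), P(B|A), P(C|B), P(D|B,C).
# These numbers are invented for illustration; they are not from the comment.
estimates = {"A": 0.95, "B|A": 0.80, "C|B": 0.85, "D|B,C": 0.60}

# Chain rule: P(A and B and C and D) is the product of the conditionals.
conditional_total = 1.0
for claim, p in estimates.items():
    conditional_total *= p
    print(f"after {claim:>5}: running total = {conditional_total:.3f}")

# Even uniform estimates of about 0.59 per claim land near the quoted figure:
print(f"uniform 0.59 per claim: {0.59 ** 4:.3f}")   # ~0.121, just under 1/8

# One reading of the timing remark: P(we wipe ourselves out first) < 11/12,
# so P(surviving long enough) > 1/12, which puts the unconditional probability
# above 0.12 * (1/12) = 0.01, i.e. the 1% threshold from the opening quote.
print(f"implied unconditional floor: {0.12 * (1 / 12):.3f}")
```

Under the placeholder numbers the conditional total comes out around 0.39; the only point of the sketch is that the product stays above 12% as long as each individual estimate stays reasonably high, and that the 11⁄12 bound is enough to clear the 1% bar from the opening quote.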
From E.Y. on the same page, we have this:
I’m generally reluctant to assign exact probabilities to topics like these; I consider it a sin, like giving five significant digits on something you cannot calculate to 1 part in 10,000 precision.
We should be assigning error bars to our probabilities? A cute idea—but surely life is too short for that.
I would take the other side of a bet on that at 1000:1 odds.
:-)
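For scale, here is what taking the other side of that bet at 1000:1 would imply about the bettor’s probability, assuming the commenter means laying 1000 to 1 against the “wiped out by badly done AI” outcome (which side of the odds is meant is an assumption here):

```python
# Break-even probability for laying 1000:1 against an event, i.e. the largest
# probability at which offering those odds is still a favorable bet.
odds_against = 1000

break_even = 1 / (odds_against + 1)   # roughly 0.001
quoted_floor = 0.10                   # "easily larger than 10%" from the opening quote

print(f"break-even probability at 1000:1 against: {break_even:.4f}")
print(f"ratio to the 10% figure: {quoted_floor / break_even:.0f}x")
```

On that reading the two positions differ by roughly a factor of a hundred, which is what makes the offered bet a substantive statement of disagreement with the opening quote rather than a quibble over wording.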
The theory here seems to be that if someone believes preserving the environment is the most important thing, you can explain to them why preserving the environment is not the most important thing, and they will stop believing that preserving the environment is the most important thing. But is there any precedent for this result actually happening?