John did ask about timescales and my answer was that I had no logical way of knowing the answer to that question and was reluctant to just make one up.
[...]
As for guessing the timescales, that actually seems to me much harder than guessing the qualitative answer to the question “Will an intelligence explosion occur?”
The post “I don’t know” is about refusing to assign probability distributions at all. That’s entirely different from refusing to assign an overly focused probability distribution when your epistemic state doesn’t actually provide you enough information to do so; the latter is the technical way to say “I don’t know” when you really don’t know. In this case I do recall Eliezer saying at some point (something like) that he spends about 50% of his planning effort on scenarios where the singularity happens before 2040(?) and about 50% on scenarios where it happens after 2040, so he clearly does have a probability distribution he’s working with, it’s just that the probability mass is spread pretty broadly.
I agree that “spread out probability mass” is a good technical replacement for “I don’t know.” Note that the more spread out it is, the less concentrated it is in the near future. That is, the less confident you are betting on this particular random variable (time until human extinction), the safer you should feel from human extinction.
“50% before 2040” doesn’t sound like such a high-entropy RV to me, though...
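As an aside on what “spread out” does and does not pin down: a statement like “50% before 2040” fixes where the halfway point falls, but not how widely the rest of the probability mass is smeared. Here is a small illustrative sketch; both distributions below are entirely made up and are not anything Eliezer has stated.

```python
# Illustrative only: two invented distributions over the decade in which an
# intelligence explosion occurs. Both put exactly 50% of their mass before
# 2040, yet one is much more spread out (higher entropy) than the other.
import math

def entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# 10 decade buckets: 2020s, 2030s, ..., 2110s; the first two lie before 2040.
BEFORE_2040 = 2

concentrated = [0.10, 0.40, 0.40, 0.10, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00]
spread_out   = [0.20, 0.30, 0.10, 0.08, 0.07, 0.06, 0.06, 0.05, 0.04, 0.04]

for name, dist in [("concentrated", concentrated), ("spread out", spread_out)]:
    print(f"{name:12s}  P(before 2040) = {sum(dist[:BEFORE_2040]):.2f}  "
          f"entropy = {entropy_bits(dist):.2f} bits")
```

Both make the same headline claim; they differ only in how thinly the remaining mass is spread.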
Well, let’s start with the conditional probability if humans don’t find some other way to kill ourselves or end civilization before it comes to this. Eliezer seems to argue the following:
A. Given we survive long enough, we’ll find a way to write a self-modifying program that has, or can develop, human-level intelligence. (The capacity for self-modification follows from ‘artificial human intelligence,’ but since we’ve just seen links to writers ignoring that fact I thought I’d state it explicitly.) This necessarily gives the AI the potential for greater-than-human intelligence due to our known flaws. (I don’t know how we’d give it all of our disadvantages even if we wanted to. If we did, then someone else could and eventually would build an AI without such limits.)
B. Given A, the intelligence would improve itself to the point where we could no longer predict its actions in any detail.
C. Given B, the AI could escape from any box we put it in. (IIRC this excludes certain forms of encryption, but I see no remotely credible scenario in which we sufficiently encrypt every self-modifying AI forever.)
D. Given B and C, the AI could wipe out humanity if it ‘wanted’ to do so.
E. Given the above, if we in effect tell such an AI to kill us, it will try to kill us.
My estimate for the probability of some of these fluctuates from day to day, but I tend to give them all a high number. Claim A in particular seems almost undeniable given the evidence of our own existence. (I only listed that one separately so that people who want to argue can do so more precisely.) And when it comes to Claim E, which says that if you tell a computer to kill you it will try to kill you, I don’t think the alternative has enough evidence to even consider. So I find it hard to imagine anyone rationally getting a total lower than 12%, or just under 1⁄8.
Now that all applies to the conditional probability (if human technological civilization lives that long). I don’t know how to evaluate the timescale involved or the chance of us killing ourselves before the issue would come up. The latter certainly feels like less than 11⁄12.
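To make the arithmetic behind these two bounds explicit, here is a minimal back-of-the-envelope sketch. The individual per-claim probabilities are invented purely for illustration; the only numbers taken from the comment above are the roughly-1⁄8 conditional total and the 11⁄12 bound.

```python
# Illustrative sketch only: the per-claim probabilities are made up to show how
# a conditional total of "just under 1/8" might be assembled, and how it
# combines with "less than 11/12 chance we kill ourselves first".

claims = {
    "A: self-modifying human-level AI gets built (given we survive)": 0.90,
    "B: it improves itself beyond our ability to predict it":         0.70,
    "C: it can escape any box we put it in":                          0.70,
    "D: it could wipe out humanity if it 'wanted' to":                0.55,
    "E: told (in effect) to kill us, it would try":                   0.50,
}

conditional = 1.0
for claim, p in claims.items():
    conditional *= p
print(f"P(wipe-out | civilization survives long enough) ~ {conditional:.3f}")
# ~0.12, i.e. just under 1/8

# "Less than 11/12" chance of killing ourselves first means more than a 1/12
# chance that civilization does survive long enough for the issue to come up.
p_survive = 1 - 11 / 12

print(f"unconditional lower bound ~ {conditional * p_survive:.4f}")
# ~0.01, i.e. about 1%
```

Multiplying the conditional bound of roughly 1⁄8 by the better-than-1⁄12 chance of getting that far is presumably how one lands in the neighborhood of the 1% figure that comes up below.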
The question would grow in importance if we found out that we needed to convince a nationally important number of people to pay attention to the issue before someone creates a theory of Friendly AI including AI goal stability. I really hope that doesn’t apply, because I suspect that if it does we’re screwed.
Finally, some quantification!
Here’s a sequence of interpretations of this passage, in decreasing order of strength:
1. The odds of us being wiped out by badly done AI are easily larger than 10%.
2. The odds of us being wiped out by badly done AI are larger than or equal to 10%.
3. There can be no compelling qualitative argument that the probability of us being wiped out by badly done AI is less than 1%.
4. There is a compelling argument that the probability of us being wiped out by badly done AI is greater than or equal to 1%.
I would be very grateful to see the weakest of these claims, number 4, supported with some calculations.
Of course I wish that there was a date attached to these claims. Easily greater than 10% chance that we’ll be wiped out in the next 50 years?
Eliezer Yudkowsky says:
[...]
We seem to have caught Yudkowsky in a moment of hypocrisy: he doesn’t know when an intelligence explosion will occur.