Quickly sketching out some of my views—deliberately quite basic because I don’t typically try to generate very accurate credences for this sort of question:
When I think about the two tasks “solve alignment” and “take over the world”, I feel pretty uncertain which one will be achieved first. There are a bunch of considerations weighing in each direction. On balance I think the former is easier, so let’s say that conditional on one of them happening, 60% that the former happens first, and 40% that the latter does. (This may end up depending quite sensitively on deployment decisions.)
There are also a bunch of possible misuse-related catastrophes. In general I don’t see particularly strong arguments for any of them, but AIs are just going to be very good at generating weapons etc., so I’m going to give misuse a 10% chance of being existentially bad.
I feel like I have something like 33% Knightian uncertainty when reasoning in this domain—i.e. that there’s a 33% chance that what ends up happening doesn’t conform very well to any of my categories.
So if we add this up we get numbers something like: P(takeover) = 24%, P(doom) = 31%, P(fine) = 36%, P(???) = 33%
I think that in about half of the takeover scenarios I’m picturing, many humans end up living pretty reasonable lives, and so the probability of everyone dying ends up more like 15-20% than 30%.
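To make the arithmetic explicit, here is one way these numbers can be reconstructed. This is a sketch under two assumptions that the post doesn’t state outright: that the 10% misuse figure and the 60/40 alignment-vs-takeover split apply to the non-Knightian portion of probability mass, and that “doom” covers both takeover and misuse catastrophes (which is why 31% + 36% + 33% = 100% while takeover sits inside doom).

```python
# Sketch of the probability bookkeeping, under the assumptions stated above.

knightian = 0.33                         # P(???): outcomes that fit none of the categories
non_knightian = 1 - knightian            # 0.67

misuse_doom = 0.10 * non_knightian       # ~7%: existentially bad misuse
remainder = non_knightian - misuse_doom  # ~60%: the alignment-vs-takeover race

takeover = 0.40 * remainder              # ~24%: takeover happens first
fine = 0.60 * remainder                  # ~36%: alignment gets solved first

doom = takeover + misuse_doom            # ~31%

# In roughly half the takeover scenarios many humans still live reasonable lives,
# so "everyone dies" drops from ~31% to ~19%, i.e. the 15-20% range.
everyone_dies = doom - 0.5 * takeover

print(f"P(takeover) ~ {takeover:.0%}, P(doom) ~ {doom:.0%}, "
      f"P(fine) ~ {fine:.0%}, P(???) ~ {knightian:.0%}")
print(f"P(everyone dies) ~ {everyone_dies:.0%}")
```

Running this reproduces the quoted figures (24%, 31%, 36%, 33%, and ~19% for everyone dying), but the decomposition itself is a reconstruction rather than something spelled out in the post.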
I notice that many of these numbers seem pretty similar to Paul’s. My guess is that this mostly reflects the uninformativeness of explicit credences in this domain. I usually don’t focus much on generating explicit credences because I don’t find them very useful.
Compared with Paul, I think the main difference (apart from the Knightian uncertainty bit) is that I’m more skeptical of non-takeover x-risk. I basically think the world is really robust, and it’s gonna take active effort to really derail it.
I think the substance of my views can be mostly summarized as:
AI takeover is a real thing that could happen, not an exotic or implausible scenario.
By the time we build powerful AI, the world will likely be moving fast enough that a lot of stuff will happen within the next 10 years.
I think that the world is reasonably robust against extinction but not against takeover or other failures (for which there is no outer feedback loop keeping things on the rails).
I don’t think my credences add very much except as a way of quantifying that basic stance. I largely made this post to avoid confusion after quoting a few numbers on a podcast and seeing some people misinterpret them.
Yepp, agree with all that.