I seem to remember your p(doom) being 85% a short while ago. I’d be interested to know why it has dropped to 70%, or, to put it another way, why you believe our odds of non-doom have doubled.
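(To spell out the arithmetic behind “doubled”: the chance of non-doom is just the complement of p(doom), so

$$P(\text{non-doom}) = 1 - P(\text{doom}): \qquad 1 - 0.85 = 0.15 \;\longrightarrow\; 1 - 0.70 = 0.30.$$

Strictly speaking the odds, $0.15/0.85 \approx 0.18$ versus $0.30/0.70 \approx 0.43$, more than double, but the complement probability doubles exactly.)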
Whereas my timelines views are extremely well thought through (relative to most people, that is), I feel much more uncertain and unstable about my p(doom). That said, here’s why I updated:
Hinton and Bengio have come out as worried about AGI x-risk; the FLI letter and Yudkowsky’s podcast tour, while incompetently executed, have been better received by the general public and by elites than I expected; the big labs (especially OpenAI) have reiterated that superintelligent AGI is a thing, that it might come soon, that it might kill everyone, and that regulation is needed; and internally, OpenAI at least has pushed for more focus on these big issues as well. Oh, and there’s been some cool progress in interpretability & alignment; it doesn’t come close to solving the problem on its own, but it makes me optimistic that we aren’t barking up the wrong trees or completely hitting a wall. (I’m thinking of, e.g., the cheese-vector and activation-vector work and the discovering-latent-knowledge work.)
As for capabilities: yes, it’s bad that tons of people are now experimenting with AutoGPT and making their own LLM startups, and it’s bad that Google DeepMind is apparently doing some AGI mega-project, but… those things were already priced in, by me at least. I fully expected the other big corporations to ‘wake up’ at some point and start racing hard, and the capabilities we’ve seen so far are pretty much exactly on trend for my What 2026 Looks Like scenario, which involved AI takeover in 2027 and singularity in 2028.
Basically, I feel like we are on track to rule out one of the possible bad futures (the one in which the big corporations circle the wagons and say AGI is Safe, there is No Evidence of Danger, the AI x-risk people are Crazy Fanatics, and the government buys their story long enough for it to be too late). Now unfortunately, the most likely bad future remains: the government does implement some regulation intended to fix the problem, but it fails to fix the problem & fails to buy us any significant amount of time before the dangerous sorts of AGI are built and deployed (e.g. because it gets watered down by tech companies averse to abandoning profitable products and lines of research, or because racing with China causes everyone to go ‘well, actually’ when the time comes to slow down and change course).
Meanwhile, one of the good futures (in which the regulation is good and succeeds in preventing people from building the bad kinds of AGI for years, buying time for more alignment, interpretability, and governance work, and for the world generally to become more aware of and focused on the problems) is looking somewhat more likely.
So I still think we are on a default path to doom, but one of the plausible bad futures now seems less likely and one of the plausible good futures more likely. So yeah.
Thanks for this. I was just wondering how your views have updated in light of recent events.
Like you, I think things are going better than my median prediction, but paradoxically I’ve been feeling even more pessimistic lately. Reflecting on this, I think my p(doom) has gone up rather than down: some of the good futures where a lot of my probability mass for non-doom was concentrated have also disappeared, and that seems to outweigh the especially bad futures going away, leaving me more pessimistic overall.
These especially good futures were 1) AI capabilities hit a wall before reaching human level, and 2) humanity handles AI risk especially competently; e.g., at this stage leading AI labs would be talking clearly about existential risk in their public communications and making serious efforts to avoid race dynamics, there would be more competent public discussion of takeover risk than we see today (including fully-baked regulatory proposals), and many people would be starting to take less obvious (non-takeover) AI-related x-risks (like the ones Paul mentions in this post) seriously.
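To illustrate with made-up numbers (a toy sketch, not my actual credences): suppose I had previously spread my non-doom mass roughly as

$$P(\text{capabilities wall}) = 0.20,\quad P(\text{competent handling}) = 0.10,\quad P(\text{other non-doom}) = 0.05,$$

for a total of 0.35 non-doom and hence p(doom) = 0.65. If the first two now look more like 0.05 and 0.02, then even if the especially bad “corporations successfully deny the risk” branch shrinks and some of its mass moves into other non-doom futures (say, raising that bucket to 0.13), total non-doom falls to 0.20 and p(doom) rises to 0.80. The good futures that vanished were carrying most of my non-doom mass, so losing them dominates the update.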
Makes sense. I had basically decided by 2021 that those good futures (1) and (2) were very unlikely, so yeah.