Are extreme probabilities for P(doom) epistemically justified?
Can you post the superforecaster report that has the 0.12% P(Doom) number? I have not actually read anything of course and might be talking out of my behind.
In any case, there have been several cases where OpenPhil or somebody or other has brought in ‘experts’ of various ilk to debate P(Doom), the probability of existential risk (usually in the context of AI).
Many of these experts give very low percentages. One percentage I remember was 0.12%.
In the latest case these were Superforecasters, Tetlock’s anointed. Having ‘skin in the game’ they outperformed the fakexperts in various prediction markets.
So we should defer to them (partially) on the big questions of x-risk also. Since they give very low percentages that is good. So the argument goes.
Alex thinks these percentages are ridiculously, unseriously low. I would even go as far as saying that the superforecasters aren’t really answering the actual question when they give these percentages for x-risk (both AI and general x-risk).
Yeah, there are two relevant reports. This is the first one, which has estimates for a range of x-risks including from AI, where superforecasters report a mean estimate of 0.4% for AI causing extinction. And this is the second one, which has the 0.12% figure; it specifically selected for forecasters who were sceptical of AI extinction and experts who were concerned, with the purpose of finding cruxes between the two groups.
I think that we should defer quite a lot to superforecasters, and I don’t take these percentages as being unseriously low.
I also think it’s important to clarify that the 0.4% number refers to human extinction in particular, rather than something broader like loss of control of the future, over 1bn people dying, outcomes as bad as extinction etc.
One issue with deferring to superforecasters on x-risk percentages is that a true x-risk is a prediction question that never gets resolved.
The optimal strategy when playing a prediction market on a question that doesn’t get resolved is to give 0% probability.
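To make the incentive concrete, here is a minimal sketch. It assumes forecasts are scored with the Brier score and that scores are only ever tallied in worlds where the catastrophe did not happen. The function names and the 5% figure are hypothetical, chosen for illustration, and are not taken from either report.

```python
def brier(forecast: float, outcome: int) -> float:
    """Brier score for a single binary forecast (lower is better)."""
    return (forecast - outcome) ** 2


def expected_score_if_always_scored(forecast: float, true_prob: float) -> float:
    """Expected Brier score when the question resolves in every world."""
    return true_prob * brier(forecast, 1) + (1 - true_prob) * brier(forecast, 0)


def expected_score_if_scored_only_on_survival(forecast: float) -> float:
    """Expected Brier score when scores only count if the event did NOT happen.

    Conditional on anyone being around to score the question, the outcome is 0.
    """
    return brier(forecast, 0)


TRUE_PROB = 0.05  # hypothetical true probability of the catastrophe
grid = [i / 100 for i in range(101)]  # candidate forecasts 0.00, 0.01, ..., 1.00

# Under ordinary scoring, honesty is optimal: the best forecast is the true probability.
best_honest = min(grid, key=lambda f: expected_score_if_always_scored(f, TRUE_PROB))

# Under survival-only scoring, the best forecast ignores the true probability entirely.
best_survivor = min(grid, key=expected_score_if_scored_only_on_survival)

print(best_honest, best_survivor)
```

Under these assumptions the first optimum sits at the true probability and the second sits at zero, which is the worry stated above. Whether real forecasters respond to that incentive is a separate empirical question.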
I think I’m basically not worried about that. I predict that the superforecasters in the report took the task seriously and forecasted in the same way they would other questions. It has also not been made public who the forecasters were, and there’s no money on the line or even internet points.
Those are some good points.
We’re getting to another issue. Why is it appropriate to defer to a forecaster who does well on short-horizon prediction in fairly well-understood domains, on a question about something that has literally never happened (this includes both x-risks and catastrophic risks)?
I would even say these are anti-correlated. By ignoring black swans you will do a little better on the time horizons in which black swans haven’t happened yet (‘picking up pennies in front of a steamroller’)
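A rough sketch of the ‘pennies in front of a steamroller’ effect, assuming independent questions that all share the same small true probability and are scored with the Brier score. The 1% probability and the 50 questions are hypothetical numbers picked only for illustration.

```python
def prob_no_event(true_prob: float, n_questions: int) -> float:
    """Chance that none of n independent rare events happens."""
    return (1 - true_prob) ** n_questions


def total_brier_when_nothing_happens(forecast: float, n_questions: int) -> float:
    """Summed Brier score over n questions that all resolve 'no'."""
    return n_questions * forecast ** 2


def expected_total_brier(forecast: float, true_prob: float, n_questions: int) -> float:
    """Expected summed Brier score over n independent questions."""
    per_question = true_prob * (forecast - 1) ** 2 + (1 - true_prob) * forecast ** 2
    return n_questions * per_question


TRUE_PROB = 0.01   # hypothetical per-question probability of a black swan
N_QUESTIONS = 50   # hypothetical length of the track record

# How often the track record contains no black swan at all.
quiet_record = prob_no_event(TRUE_PROB, N_QUESTIONS)

# In a quiet record, always saying 0% beats the calibrated 1% forecast.
ignorer_quiet = total_brier_when_nothing_happens(0.0, N_QUESTIONS)
calibrated_quiet = total_brier_when_nothing_happens(TRUE_PROB, N_QUESTIONS)

# In expectation over all records, the calibrated forecast is still better.
ignorer_expected = expected_total_brier(0.0, TRUE_PROB, N_QUESTIONS)
calibrated_expected = expected_total_brier(TRUE_PROB, TRUE_PROB, N_QUESTIONS)

print(quiet_record, ignorer_quiet, calibrated_quiet)
print(ignorer_expected, calibrated_expected)
```

With these made-up numbers the black-swan ignorer comes out ahead in a majority of track records, but by a tiny margin, and loses in expectation. So the size of the selection effect depends heavily on how rare the events are and how long the record is.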
Yep that’s reasonable and I wouldn’t advocate for deferring entirely to superforecasters. I think the appropriate strategy is using a number of different sources to predict both the probability of human extinction from AI and AI timelines.
Superforecasters though have amongst the most clearly transferable skills to predicting rare events because they’re good at predicting near-term events. It’s plausible that ignoring black swans makes you do a bit better on short time horizons and superforecasters are using this as a heuristic.
If this effect were sufficiently strong that we shouldn’t defer to superforecasters on questions over long time horizons, that would imply that people with lots of practice forecasting would do worse on long-timeline questions than people without it, and that seems basically implausible to me.
So I think the relatively weak performance of forecasters on long-term predictions means that we should defer less to them on these kinds of questions, while still giving their views some weight. I think the possibility of the black-swan heuristic means that that deference should go down a bit further, but only by a small degree.
I find it highly suspicious that superforecasters are giving low percentages for all types of x-risks.
Obviously as an AI x-risk person I am quite biased to thinking AI x-risk is a big deal and P(doom|agi) is substantial. Now perhaps I am the victim of peer pressure, crowd-thinking, a dangerous cult etc etc. Perhaps I am missing something essential.
But for other risks superforecasters are giving extremely low percentages as well.
To be sure, one sometimes feels that in certain circles (‘doomer’) there is a race to be as doomy as possible. There is some social desirability bias here, maybe some other complicated Hansonian signalling reasons. We have also seen this in various groups concerned about nuclear risks, environmental risks etc etc.
But it’s so low (0.4%, 0.12%, whatever) that I am wondering how they obtain so much confidence about these kinds of never-before-seen events.
Are they starting with a very low prior on extinction? Or are they updating on something?
These kinds of all-things-considered percentages are really low. 0.1-1 % is getting at the epsilon threshold of being able to trust social reasoning.
It is about my credence in really far-off stuff like aliens, conspiracy theories, ‘I am secretly mentally ill’, wacky stuff for which I would normally say ‘I don’t believe this’.
Just to clarify: I obviously think that in many cases very low (or high) percentages are a real and valid epistemic state.
But I see these as conditional on ‘my reasoning process is good, I haven’t massively missed some factor, I am not crazy’.
Feels quite hard to have all-things-considered percentages on one-off events that may be decades in the future.
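One way to write down the ‘conditional on my reasoning being good’ point is as a two-branch mixture. This is only a sketch: the function name and all three input numbers are hypothetical, and the choice of what to believe in the ‘my model is broken’ branch is doing most of the work.

```python
def all_things_considered(p_inside: float, p_model_ok: float, p_if_model_wrong: float) -> float:
    """Mix the inside-view estimate with a fallback for the 'my reasoning is broken' branch."""
    return p_model_ok * p_inside + (1 - p_model_ok) * p_if_model_wrong


# Hypothetical inputs, for illustration only.
P_INSIDE = 0.001         # probability according to the forecaster's own model
P_MODEL_OK = 0.9         # credence that the model has not missed something big
P_IF_MODEL_WRONG = 0.1   # probability assigned if the model is badly wrong

combined = all_things_considered(P_INSIDE, P_MODEL_OK, P_IF_MODEL_WRONG)

# The second branch alone sets a floor that the inside view cannot push below.
floor = (1 - P_MODEL_OK) * P_IF_MODEL_WRONG

print(combined, floor)
```

On these made-up inputs the combined number is roughly ten times the inside-view number, and almost all of it comes from the floor. Someone who is more confident in their model, or who thinks a broken model has no reason to err towards doom, would get a much smaller correction.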
I don’t find it suspicious that the probabilities for extinction from other causes are also really low. We do have some empirical evidence from natural pandemics and rates of death from terrorists and wars for bioweapon-mediated extinction events. We have the lack of any nuclear weapons being launched for nuclear risks, and really quite a lot of empirical evidence for climate stuff.
We also have the implied rate of time-discounting from financial instruments that experience high rates of trading and pose very few other risks, like US and Swiss bonds, which also imply low risks of everyone dying. For instance, there have been recent periods where bonds have been trading at negative real rates.
I also think that superforecasters are probably much much better than other people at taking qualitative considerations and intuitions and translating those into well-calibrated probabilities.
I really like Rafael Harth’s comment on Yudkowsky’s recent empiricism as anti-epistemology.
“It is not the case that an observation of things happening in the past automatically translates into a high probability of them continuing to happen. Solomonoff Induction actually operates over possible programs that generate our observation set (and in extension, the observable universe), and it may or may not be the case that the simplest universe is such that any given trend persists into the future. There are also no easy rules that tell you when this happens; you just have to do the hard work of comparing world models.”
And perhaps this is a deeper crux between you and me that underlies our disagreement here.
I am quite suspicious of linearly extrapolating various trendlines.
To me the pieces of evidence you name—while very interesting! - are fundamentally limited in what they can say about the future.
Stated in Bayesian terms—if we consider the history of the earth as a stochastic process then it is highly non-IID, so correlations in one time-period are of limited informativeness about the future.
I also feel these kinds of linear extrapolation would have done really badly in many historical cases.
I agree that the time series of observations of the history of the earth is deeply non-IID. But shouldn’t this make us more willing to extrapolate trends, because we aren’t facing time series composed of noise but instead time series where we can condition on the previous realisations of that time series? E.g. we could imagine the time series as some process with an autoregressive component, meaning that we should see persistence from past events.
(this comment isn’t very precise, but probably could be made more precise with some work)
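One way the autoregressive intuition could be made a little more precise is with a toy AR(1) process. This is a sketch under strong assumptions: a single fixed coefficient, Gaussian noise, and no regime changes. The coefficient 0.9, the series length, and the seed are arbitrary illustrative choices.

```python
import random


def simulate_ar1(phi: float, n_steps: int, seed: int = 0) -> list[float]:
    """Simulate x_t = phi * x_{t-1} + noise_t with standard normal noise."""
    rng = random.Random(seed)
    x = 0.0
    series = []
    for _ in range(n_steps):
        x = phi * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


def lag1_autocorrelation(series: list[float]) -> float:
    """Sample correlation between each value and the one that follows it."""
    mean = sum(series) / len(series)
    numerator = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    denominator = sum((a - mean) ** 2 for a in series)
    return numerator / denominator


iid_series = simulate_ar1(phi=0.0, n_steps=5000)         # pure noise, no memory
persistent_series = simulate_ar1(phi=0.9, n_steps=5000)  # strong dependence on the past

print(lag1_autocorrelation(iid_series))
print(lag1_autocorrelation(persistent_series))
```

In the persistent series the previous value is informative about the next one, which is the sense in which non-IID data can support extrapolation. The toy model says nothing about what happens if the coefficient itself changes, which is closer to the trend-break worry raised on the other side of this exchange.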
Why would these linear (in the generalised linear model sense) extrapolations have done badly in the past?
The kind of superficial linear extrapolation of trendlines can be powerful, perhaps more powerful than usually accepted in many political/social/futurist discussions. In many cases, successful forecasters, by betting on some high-level trend lines, outpredict ‘experts’.
But it’s a very non-gears-level model. I think one should be very careful about using this kind of reasoning for tail events.
e.g. this kind of reasoning could lead one to reject the possibility of nuclear weapons being developed.
I think mechanistic stories that are gears-level about the future can give lower bounds on tail events that are more reliable than linear trend extrapolation.
e.g. I see a clear ‘mechanistic’ path to catastrophic (or even extinction) risk from human-engineered plagues in the next 100 years. The technical details of human-engineered plagues are being suppressed, but afaict it’s either possible to make engineered plagues that are many many times more infectious, deadly, kill with a delay, are difficult to track etc., or it will be possible soon.
Scenario: some terrorist group, weird dictator, or great power conflict makes a monstrous weapon, an engineered virus that spreads like the measles or covid, but kills >10% after a long incubation period. We’ve seen how powerless the world governments were in containing covid. It doesn’t seem enough lessons were learned or have been learned since then.
I can’t imagine any realistic evidence based on market interest rates or past records of terrorist deaths or anything that economists would like would ever convince me that this is not a realistic (>1-5%) event.
Linear extrapolation of chemical explosive yields would have put nuclear weapons totally out of distribution.
But in fact, just looking at past data simply isn’t a substitute for knowing the secrets of the universe.
I think the crux here might be how we should convert these qualitative considerations into numerical probabilities, and basically, my take is that superforecasters have a really good track record of doing this well, and the average person is really bad at doing this (e.g. the average American thinks like 30% of the population is Jewish, these sorts of things).
On the chemical explosives one, AI Impacts have maybe 35 of these case studies on whether there are breakpoints in technological development, and I think explosive power is the only one where they found a break that trend extrapolation wouldn’t have predicted.
I am aware of the AI Impacts research and I like it.
I think what it suggests is that trend-breaks are rare.
1⁄35 if you will.
(Of course, one can get into some reference class tennis here. Homo sapiens are also a trend break compared to primates, animals. Is that a technology? I don’t know. It’s very vulnerable to the choice of reference class and to low-N examples.)
Fwiw the average probability given that AI kills 10%+ of the population was 2.13% in the general x-risk forecasting report, which isn’t very different from 1⁄35.
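For a quick sense of how the 1⁄35 base rate compares with the 2.13% figure, and how much it moves with only one observed break, here is a small sketch. The count of 35 is the ‘maybe 35’ from above, so treat it as approximate; the Laplace rule of succession is one standard small-sample adjustment that I am adding for illustration, not something either of us used.

```python
def raw_frequency(hits: int, trials: int) -> float:
    """Observed share of case studies that showed a trend break."""
    return hits / trials


def laplace_rule(hits: int, trials: int) -> float:
    """Rule-of-succession estimate, which adds one pseudo-hit and one pseudo-miss."""
    return (hits + 1) / (trials + 2)


N_CASE_STUDIES = 35   # approximate count mentioned in the discussion
N_TREND_BREAKS = 1    # explosive power, per the discussion
FORECAST = 0.0213     # the 2.13% figure quoted from the forecasting report

raw = raw_frequency(N_TREND_BREAKS, N_CASE_STUDIES)
smoothed = laplace_rule(N_TREND_BREAKS, N_CASE_STUDIES)

print(f"raw base rate:      {raw:.4f}")
print(f"rule of succession: {smoothed:.4f}")
print(f"forecast:           {FORECAST:.4f}")
```

The raw rate is a bit above the forecast and the smoothed rate is roughly double the raw one, which illustrates the low-N point: a single extra or missing trend break changes the answer a lot.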
I’m not sure where it’s useful to go from here. I think maybe the takeaway is that our crux is how to convert qualitative considerations combined with base rates stuff into final probabilities, and I’m much more willing to defer to superforecasters on this than you are?
I feel this is a good place to end. Thank you for your time and effort!
I would summarize my position as:
- I am less impressed by the superforecaster track record than you are. [we didn’t get into this]
- I feel linear trend extrapolation is limited in saying much about tail risk.
- I think good short-horizon predictors will predictably underestimate black swans.
- I think there is a large irreducible uncertainty about the future (and the world in general) that makes very low or very high percentages not epistemically justified.
If I were epistemically empathetic I would be able to summarize your position. I am not.
But if I were to try, I would say you are generally optimistic about forecasting, past data and empirics.
The Metaculus community strikes me as a better starting point for evaluating how different the safety inside view is from a forecasting/outside view. The case for deferring to superforecasters is the same as the case for deferring to the Metaculus community: their track record. What’s more, the most relevant comparison I know of scores Metaculus higher on AI predictions. Metaculus as a whole is not self-consistent on AI and extinction forecasting across individual questions (links below). However, I think it is fair to say that Metaculus as a whole has significantly faster timelines and higher P(doom) compared to superforecasters.
If we compare the distribution of safety researchers’ forecasts to Metaculus (maybe we have to set aside MIRI...), I don’t think there will be that much disagreement. I think remaining disagreement will often be that safety researchers aren’t being careful about how the letter and the spirit of the question can come apart and result in false negatives. In the one section of the FRI studies linked above I took a careful look at, the ARA section, I found that there was still huge ambiguity in how the question is operationalized—this could explain up to an OOM of disagreement in probabilities.
Some Metaculus links:
https://www.metaculus.com/questions/578/human-extinction-by-2100/ Admittedly in this question the number is 1%, but compare to the below. Also note that the forecasts date back as far as 2018.
https://www.metaculus.com/questions/17735/conditional-human-extinction-by-2100/
https://www.metaculus.com/questions/9062/time-from-weak-agi-to-superintelligence/ (compare this to the weak AGI timeline and other questions)
I’m interested in hearing your thoughts on this.
Agree. In some sense you have to invent all the technology before the stochastic process of technological development looks predictable to you, almost by definition. I’m not sure it is reasonable to ask general “forecasters” about questions that hinge on specific technological change. They’re not oracles.
Suggested spelling corrections:
I predict that the superforcasters in the report took
a lot of empirical evidence for climate stuff
and it may or may not be the case
There are also no easy rules that
meaning that we should see persistence from past events
I also feel these kinds of linear extrapolation
and really quite a lot of empirical evidence
are many many times more infectious
engineered virus that spreads like the measles or covid
case studies on weather there are breakpoints in technological development
break that trend extrapolation wouldn’t have predicted
It’s very vulnerable to references class and
impressed by superforecaster track records than you are.
My expectation is that superforecasters weren’t able to look into detailed arguments that represent the x-risk well and they would update after learning more.
I think this proves too much: this would predict that superforecasters would be consistently outperformed by domain experts, when typically the reverse is true.
I think I agree.
For my information, what’s your favorite reference for superforecasters outperforming domain experts?
As of two years ago, the evidence for this was sparse. Looked like parity overall, though the pool of “supers” has improved over the last decade as more people got sampled.
There are other reasons to be down on XPT in particular.
My expectation is that superforecasters weren’t able to look into detailed arguments that represent the x-risk well and they would update after learning more.
I quite enjoyed this conversation, but imo the x-risk side needs to sit down to make a more convincing, forecasting-style prediction to meet forecasters where they are. A large part of it is sorting through the possible base rates and making an argument for which ones are most relevant. Once the whole process is documented, then the two sides can argue on the line items.
Thank you! Glad you liked it. ☺️
LessWrong & EA are inundated with the same old arguments for AI x-risk repeated in a hundred different formats. Could this really be the difference?
Besides, aren’t superforecasters supposed to be the Kung Fu masters of doing their own research ;-)
I agree with you that a crux is base rate relevancy. Since there is no base rate for x-risk I’m unsure how to translate this to superforecaster language tho
Well, what base rates can inform the trajectory of AGI?
- dominance of H. sapiens over other hominids
- historical errors in forecasting AI capabilities/timelines
- impacts of new technologies on animals they have replaced
- an analysis of what base rates AI has already violated
- rate of bad individuals shaping world history
- analysis of similarity of AI to the typical new technology that doesn’t cause extinction
- success of terrorist attacks
- impacts of covid
- success of smallpox eradication
Would be an interesting exercise to do to flesh this out.