“The most pressing practical question for future work is: why were superforecasters so unmoved by experts’ much higher estimates of AI extinction risk, and why were experts so unmoved by the superforecasters’ lower estimates? The most puzzling scientific question is: why did rational forecasters, incentivized by the XPT to persuade each other, not converge after months of debate and the exchange of millions of words and thousands of forecasts?”
This post by Peter McCluskey, a participating superforecaster, renders the question essentially non-puzzling to me. Doing better would be fairly simple, although attracting and incentivising the relevant experts would be fairly expensive.
The questions were in many cases somewhat off from the endpoints we care about, or framed in ways that I believe would distort straightforward attempts to draw conclusions.
The incentive structure of predicting the apocalypse is necessarily screwy, and using a Keynesian beauty contest for predictions doesn’t really fix it.
Most of the experts and superforecasters just don’t know much about AI, and thought that (as of 2022) the recent progress was basically just hype. Hopefully it’s now clear that this was just wrong?
Some selected quotes:
I didn’t notice anyone with substantial expertise in machine learning. Experts were apparently chosen based on having some sort of respectable publication related to AI, nuclear, climate, or biological catastrophic risks. … they’re likely to be more accurate than random guesses. But maybe not by a large margin.
Many superforecasters suspected that recent progress in AI was the same kind of hype that led to prior disappointments with AI. I didn’t find a way to get them to look closely enough to understand why I disagreed. My main success in that area was with someone who thought there was a big mystery about how an AI could understand causality. I pointed him to Pearl, which led him to imagine that problem might be solvable.
I didn’t see much evidence that either group knew much about the subject that I didn’t already know. So maybe most of the updates during the tournament were instances of the blind leading the blind. None of this seems to be as strong evidence as the changes since the tournament in the opinions of leading AI researchers such as Hinton and Bengio.
I think the core problem is actually that it’s really hard to get good public predictions of AI progress, in any more detail than “extrapolate compute spending, hardware price/performance, scaling laws, and then guess at what downstream-task performance that implies (and whether we’ll need a new paradigm for AGI [tbc: no!])”. To be clear, I think that’s a stronger baseline than the forecasting tournament achieved!
But downstream-task performance is hard to predict, and there’s a fair bit of uncertainty in the other parameters too. Details are somewhere between “trade secrets” and “serious infohazards”, and the people who are best at predicting AI progress mostly (for that reason!) work at frontier labs with AI-xrisk-mitigation efforts. I think it’s likely that inferring frontier-lab members’ beliefs from their actions and statements would give you better estimates than another such tournament.
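For concreteness, here’s roughly what that extrapolation baseline looks like as a few lines of Python. Every number in it (the growth rates, the starting compute, the scaling-law constants) is an illustrative placeholder I’ve made up rather than a real estimate, and the scaling law is a generic compute-only power law, not anything published by a lab.

```python
# Minimal sketch of the extrapolation baseline described above. All constants
# are illustrative placeholders, not real estimates.

def projected_training_compute(years_ahead: float,
                               current_flop: float = 1e25,      # placeholder frontier run
                               spend_growth: float = 3.0,       # assumed yearly spending multiple
                               price_perf_growth: float = 1.35  # assumed yearly FLOP/$ improvement
                               ) -> float:
    """Project frontier training compute (FLOP) by compounding growth in
    spending and in hardware price/performance."""
    return current_flop * (spend_growth * price_perf_growth) ** years_ahead


def toy_scaling_law_loss(compute_flop: float,
                         irreducible: float = 1.69,
                         coeff: float = 1070.0,
                         exponent: float = 0.154) -> float:
    """Generic compute-only power law: loss = irreducible + coeff * C^-exponent.
    The constants are placeholders, not a fit to any real model family."""
    return irreducible + coeff * compute_flop ** (-exponent)


for years in (0, 2, 5):
    c = projected_training_compute(years)
    print(f"+{years}y: ~{c:.1e} FLOP, toy loss ~{toy_scaling_law_loss(c):.2f}")
```

The step the script leaves out, mapping that loss number to downstream-task performance, is exactly where the judgment and the uncertainty concentrate.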
The interesting thing to me about the question, “Will we need a new paradigm for AGI?” is that a lot of people seem to be focused on this, but I think it misses a nearby important question.
As we get closer to a complete AGI, and start to get more capable programming and research assistant AIs, will those make algorithmic exploration cheaper and easier, such that we see a sort of ‘Cambrian explosion’ of model architectures which work well for specific purposes, and perhaps one of these works better at general learning than anything we’ve found so far and ends up being the architecture that first reaches full transformative AGI?
The point I’m generally trying to make is that estimates of software/algorithmic progress are based on the progress being made (currently) mostly by human minds. The closer we get to generally competent artificial minds, the less we should expect past patterns based on human inputs to hold.
Tom Davidson’s work on a compute-centric framework for takeoff speed is excellent, IMO.
I generally agree; I just have some specific evidence which I believe should adjust estimates in the report towards expecting more accessible algorithmic improvements than some people seem to think.
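To make the feedback loop above concrete, here’s a toy model in Python. It is emphatically not Davidson’s framework and isn’t calibrated to anything; ai_share, ai_speedup, and the other parameters are assumptions chosen only to show how a human-only trend extrapolation falls behind once AIs contribute meaningfully to AI R&D.

```python
# Toy model (not Davidson's framework) of AI-accelerated algorithmic progress.
# Assumption: effective R&D effort is the human share plus an AI share that is
# both growing and more productive per unit of work; progress compounds at
# base_rate raised to that effort. All parameter values are illustrative.

def algorithmic_efficiency(years: int,
                           base_rate: float = 2.0,    # assumed gain per year of human-only effort
                           ai_share: float = 0.10,    # assumed initial fraction of R&D done by AIs
                           share_growth: float = 1.5, # assumed yearly growth of that fraction
                           ai_speedup: float = 3.0    # assumed productivity multiple of the AI share
                           ) -> list[float]:
    """Cumulative algorithmic-efficiency multipliers, year by year."""
    efficiency, trajectory = 1.0, []
    for _ in range(years):
        ai_share = min(1.0, ai_share * share_growth)
        effort = (1.0 - ai_share) + ai_speedup * ai_share
        efficiency *= base_rate ** effort
        trajectory.append(round(efficiency, 1))
    return trajectory


# A human-only extrapolation of base_rate would predict 2, 4, 8, 16, 32x;
# the feedback loop pulls the curve well above that within a few years.
print(algorithmic_efficiency(5))  # -> roughly [2.5, 6.7, 21.5, 86.7, 496.7]
```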