Systematic surveys aggregating opinions from hundreds of AI researchers.
Reports that Open Philanthropy employees spent thousands of hours on, systematically presenting evidence and considering arguments and counterarguments.
There’s plenty of room for debate on how much these measures should be expected to improve our foresight, compared to what the “Big Three” were doing.
My guess would be that these measures result in predictions somewhat worse than the Big Three’s. If you want a reference class for “more serious” forecasting, I’d say go look for forecasts by fancy consulting agencies or think tanks. My guess would be that they do somewhat worse, mainly because their authors are optimizing to Look Respectable rather than just optimizing purely for accuracy. And the AI researcher surveys and OpenPhil reports also sure do look like they’re optimizing a significant amount for Looking Respectable.
Is the point that 1) AGI specifically is too weird for normal forecasting to work, or 2) that you don’t trust judgmental forecasting in general, or 3) that respectability bias swamps the gains from aggregating a heavily selected crowd, spending more time, and debiasing in other ways?
The OpenPhil longtermists’ respectability bias seems fairly small to me; their weirder stuff is comparable to Asimov (but not Clarke, who wrote a whole book about cryptids).
And against this, you have to factor in the Big Three’s huge bias towards being entertaining instead of accurate (as well as e.g. Heinlein’s inability to admit error).
Can you point at examples? (Bio anchors?)
The third: respectability bias easily swamps the gains. (I’m not going to try to argue that case here, just give a couple examples of what such tradeoffs look like.)
This is much more about the style of analysis/reasoning than about the topics; OpenPhil is certainly willing to explore weird topics.
As an example, let’s look at the nanotech risk project you linked to. The very first thing in that write-up is:
According to the definition set by the U.S. National Nanotechnology Initiative:
Nanotechnology is...
So right at the very beginning, we’re giving an explicit definition. That’s almost always an epistemically bad move. It makes the reasoning about “nanotech” seem more legible, but in actual fact the reasoning in the write-up was based on an intuitive notion of “nanotech”, not on this supposed definition. If the author actually wanted to rely on this definition, and not drag in intuitions about nanotech which don’t follow from the supposed definition, then the obvious thing to do would be to make up a new word—like “flgurgle”—and give “flgurgle” the definition. And then the whole report could talk about risks from flgurgle, and not have to worry about accidentally dragging in unjustified intuitions about “nanotech”.
… of course that would be dumb, and not actually result in a good report, because using explicit definitions is usually a bad idea. Explicit definitions just don’t match the way the human brain actually uses words.
But a definition does sound very Official and Respectable and Defendable. It’s even from an Official Government Source. Starting with a definition is a fine example of making a report more Respectable in a way which makes its epistemics worse.
(The actual thing one should usually do instead of giving an explicit definition is say “we’re trying to point to a vague cluster of stuff like <list of examples>”. And, in fairness, the definition used for nanotech in the report does do that to some extent; it actually does a decent job of avoiding the standard pitfalls of “definitions”. But the US National Nanotechnology Initiative’s definition is still, presumably, optimized more for academic politics than for accurately conveying the intuitive notion of “nanotech”.)
The explanation of “Atomically Precise Manufacturing” two sections later is better, though it’s mostly just summarizing Drexler.
Fast forward to the section on “Will it eventually be possible to develop APM?”. Most of the space in this section is spent summarizing two reports:
The feasibility of atomically precise manufacturing has been reviewed in a report published by the US National Academy of Sciences (NAS). The NAS report was initiated in response to a Congressional request, and the result was included in the first triennial review of the U.S. National Nanotechnology Initiative. [...]
and
A Royal Society report was dismissive of the feasibility of ‘molecular manufacturing,’ [...]
Ok, so here we have two reports which absolutely scream “academic politics” and are very obviously optimized for Respectability (Congressional request! Triennial review! Institutional acronyms (IA)!) rather than accuracy. Given that the author of the OpenPhil piece went looking for stuff like that, we can make some inferences about the relative prioritization of Respectability and accuracy for the person writing this report.
So that’s two examples of Respectability/accuracy tradeoff (definitions and looking for Official Institutional Reports).