My deeply concerning impression is that OpenPhil (and the average funder) has timelines 2-3x longer than the median safety researcher's. Daniel has his AGI training requirements set to 3e29 FLOP, and I believe the 15th-85th percentiles among safety researchers would span 1e31 +/- 2 OOMs. On that view, Tom's default values are off in the tails.
My suspicion is that funders write off this discrepancy, if they notice it, as inside-view bias, i.e. they assume safety researchers self-select for scaling optimism. My admittedly very crude mental model of an OpenPhil funder makes two further mistakes in this vein: (1) mistakenly taking the Cotra report's biological-anchors weighting as a justified default setting of parameters rather than as an arbitrary choice that should be updated given recent evidence; (2) far overweighting the semi-informative priors report despite semi-informative priors having abjectly failed to predict Turing-test-level AI progress. Semi-informative priors apply to large-scale engineering efforts, which in the AI domain has meant AGI and the Turing test. Insofar as funders admit that the engineering challenges involved in passing the Turing test have been solved, they should discard semi-informative priors as failing to be predictive of AI progress.
To be clear, I see my empirical claim about disagreement between the funding and safety communities as the most important point, independently of my diagnosis of that disagreement. If the empirical claim is true, OpenPhil should investigate the cruxes separating them from safety researchers, and at least allocate some of their budget under the hypothesis that the safety community is correct.
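As a rough illustration of the quantitative claim above, here is a minimal sketch that treats the guessed spread (1e31 median, 15th-85th percentiles at +/- 2 OOMs) as a normal distribution in log10(FLOP) space. None of these numbers come from an actual survey; they are my stated guesses:

```python
# Purely illustrative: guessed safety-researcher spread over AGI training compute,
# modeled as a normal distribution in log10(FLOP) space.
import math
from scipy.stats import norm

median_log10 = 31.0                # guessed median: 1e31 FLOP
sigma = 2.0 / norm.ppf(0.85)       # +/- 2 OOMs at the 15th/85th percentiles -> ~1.93

dist = norm(loc=median_log10, scale=sigma)

print(dist.ppf([0.15, 0.85]))      # sanity check: [29, 33], i.e. 1e29 to 1e33 FLOP
print(dist.cdf(math.log10(3e29)))  # Daniel's 3e29 setting -> roughly the 20th percentile
```

On these assumed numbers, "off in the tails" corresponds to default parameter choices outside the roughly 1e29-1e33 FLOP band.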
What concrete cruxes would you most like to see investigated?
To elaborate on what Jacob said:
A long time ago I spent a few months reading and thinking about Ajeya’s bio anchors report. I played around with the spreadsheet version of it, trying out all sorts of different settings, and in particular changing the various settings to values that I thought were more plausible.
As a result I figured out what the biggest cruxes were between me and Ajeya—the differences in variable-settings that led to the largest differences in our timelines.
The biggest one was (unsurprisingly, in retrospect) the difference in where we put our probability mass for the training requirements distribution. That in turn broke down into several sub-cruxes.
I wrote Fun with +12 OOMs to draw everyone's attention to that big uber-crux. In addition to pointing it out, the post operationalized and explained it so that people didn't have to be deeply familiar with Ajeya's report to understand what the debate was about. I also gave five very concrete examples of things you could do with +12 OOMs, which people could then argue about in the service of answering the uber-crux.
So, what I would like to see now is the same thing I wanted to see after writing the post, i.e. what I hoped the post would inspire: a vigorous debate over questions like “What are the reasons to think OmegaStar would constitute AGI/TAI/etc.? What are the reasons to think it wouldn't?”, “What about Crystal Nights?”, and “What about a smaller version of OmegaStar that was only +6 OOMs instead of +12? Is that significantly less likely to work, or is the list of reasons why it might or might not work basically the same?” All in the service of answering the Big Crux: the probability that +12 OOMs would be enough, and more generally what the probability distribution over OOMs should be.
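To make it concrete why the training-requirements distribution dominates, here is a toy model. It is emphatically not Ajeya's actual spreadsheet, and every number in it (the +6 vs +12 OOM medians, the 3-OOM spread, the half-OOM-per-year growth rate) is an assumption chosen purely for illustration:

```python
# Toy timelines model: a distribution over "OOMs of effective compute beyond 2020
# needed for TAI", combined with an assumed growth rate of effective compute
# (hardware + spending + algorithms lumped together), yields a cumulative timeline.
import numpy as np
from scipy.stats import norm

def timeline_cdf(median_ooms, sigma_ooms, ooms_per_year, years):
    """P(enough effective compute by year t) under the toy model."""
    requirements = norm(loc=median_ooms, scale=sigma_ooms)
    return requirements.cdf(ooms_per_year * years)

years = np.arange(0, 41)                         # 2020 .. 2060
short_view = timeline_cdf(6, 3, 0.5, years)      # median guess: +6 OOMs is enough
long_view = timeline_cdf(12, 3, 0.5, years)      # median guess: +12 OOMs is enough

print(2020 + int(np.argmax(short_view >= 0.5)))  # median timeline ~2032
print(2020 + int(np.argmax(long_view >= 0.5)))   # median timeline ~2044
```

Under this toy growth assumption, shifting the median requirement by 6 OOMs moves the median timeline by about a dozen years, which is why arguments over "would +12 OOMs be enough?" carry so much weight.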
This is an empirical question, so I may be missing some key points. Anyway, here are a few:
My above points on Ajeya anchors and semi-informative priors
Or, put another way, why reject Daniel’s post?
Can deception precede economically transformative AI (TAI)?
Possibly offer a prize for formalizing and/or distilling the argument for deception (and its constituents, i.e. gradient hacking, situational awareness, and non-myopia)
How should we model software progress? In particular, what is the right functional form for modeling short-term returns on investment in algorithmic progress?
My guess is that most researchers with short timelines think, as I do, that there's lots of low-hanging fruit here. Funders may underestimate how common this view is, since most safety researchers avoid discussing the details publicly for fear of accelerating capabilities.
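As a sketch of what the functional-form question above is asking, here is a toy comparison of two candidate shapes: constant-elasticity returns, where effort keeps buying effective-compute OOMs at a fixed rate, versus a depleting pool of low-hanging fruit. All parameters are invented for illustration, and neither form is endorsed by anyone in this thread:

```python
# Toy comparison of candidate return functions for algorithmic progress.
# Output is "OOMs of effective compute gained" as a function of research effort.
import numpy as np

def power_law_ooms(effort, elasticity=1.2):
    """(a) Constant elasticity: each doubling of effort buys a fixed number of OOMs."""
    return elasticity * np.log10(effort)

def depleting_pool_ooms(effort, pool=3.0, rate=0.5):
    """(b) Finite pool of low-hanging fruit that depletes as it is picked."""
    return pool * (1 - np.exp(-rate * effort))

for effort in (1, 2, 5, 10):  # arbitrary units of research effort
    print(effort, round(power_law_ooms(effort), 2), round(depleting_pool_ooms(effort), 2))
```

Whether the short run looks more like (a) or (b), and with what parameters, is exactly the kind of detail that stays unargued in public if researchers avoid discussing it.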
Basically, a rational reason to have longer timelines is that there's a non-trivial chance that safety researchers are wrong due to selection effects, community epistemic problems, and overestimating the impact of AGI.
Link below:
https://forum.effectivealtruism.org/posts/L6ZmggEJw8ri4KB8X/my-highly-personal-skepticism-braindump-on-existential-risk#comments
There’s definitely a bias/selection effect pushing this community towards having shorter timelines. However, there’s also definitely a bias/selection effect pushing the world in general towards having longer timelines—the anti-weirdness heuristic, wanting-to-not-sound-like-a-crackpot heuristic, wanting-to-sound-like-a-sober-skeptic bias, and probably lots of others that I’m not thinking of. Oh yeah, and just general ignorance of history and the topic of tech progress in particular. I suspect that on the whole, the biases pushing people towards longer timelines are stronger than the biases pushing people towards shorter timelines. (Obviously it differs case by case; in some people the biases are stronger one way, in other people the biases are stronger in the other way. And in a few rare individuals the biases mostly cancel out or are not strong in the first place.)
I generally prefer to make up my mind about important questions by reasoning them through on the object level, rather than by trying to guess which biases are strongest and then guess how much I should adjust to correct for them. And I especially recommend doing that in this case.
That post seems mainly to address high-P(doom) arguments and reject them. I agree with some of those arguments and with the rejection of high P(doom), but I don't see it as directly relevant to my previous comment. As for the broader point about self-selection, I think it's important but cuts both ways: funders are selected for being competent generalists (and are biased towards economic arguments), so they are predisposed to under-update on inside views. As an extreme case of this, consider e.g. Bryan Caplan.
Here are comments on two of Nuno’s arguments which do apply to AGI timelines:
(A) “Difference between in-argument reasoning and all-things-considered reasoning”: this seems closest to my point (1), which is often an argument for shorter timelines.
(B) “there is a small but intelligent community of people who have spent significant time producing some convincing arguments about AGI, but no community which has spent the same amount of effort”. This strikes me as important, but likely not true without heavy caveats. Academia celebrates work pointing out clear limitations of existing work, e.g. Will Merrill's work [1,2] and Inverse Scaling Laws. It's true that there's no community organized around this work, but the important variables are incentives, scale, and number of researcher-hours, not community.
Yeah, I agree that the disagreement is probably more important to resolve, and I haven’t much addressed that.