Tagging @romeo who did our security forecast.
Oh I misunderstood you sorry. I think the form should have post-2023, not sure about the website because it adds complexity and I’m skeptical that it’s common that people are importantly confused by it as is.
Whew, a critique that our takeoff should be faster for a change, as opposed to slower.
Fun fact: AI-2027 estimates that getting to ASI might take the equivalent of a 100-person team of top human AI research talent working for tens of thousands of years.
(Calculation details: For example, in October 2027 of the AI-2027 modal scenario, they have “330K superhuman AI researcher copies thinking at 57x human speed”, which is 1.6 million person-years of research in that month alone. And that’s mostly going towards inventing ASI, I think. Did I get that right?)
This depends on how large you think the penalty is for parallelized labor as opposed to serial. If 330k parallel researchers is more like equivalent to 100 researchers at 50x speed than 100 researchers at 3,300x speed, then it’s more like a team of 100 researchers working for (50*57)/12=~250 years.
Also of course to the extent you think compute will be an important input, during October they still just have a month’s worth of total compute even though they’re working for 250-25,000 subjective years.
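For concreteness, here is the arithmetic behind both framings as a minimal sketch (all input numbers, i.e. the 330K copies, the 57x speed, and the illustrative 100-researchers-at-50x equivalence, come from the comments above, not from new estimates):

```python
# Arithmetic behind the two framings above (numbers taken from this thread).

copies = 330_000   # superhuman AI researcher copies in Oct 2027 (AI-2027 modal scenario)
speed = 57         # thinking speed relative to a human researcher

# Naive framing: treat every copy as fully parallel human-equivalent labor.
person_years_per_month = copies * speed / 12
print(f"Naive person-years in that one month: {person_years_per_month:,.0f}")  # ~1.57 million

# Penalized framing: suppose 330K parallel copies are only as productive as
# 100 top researchers running at 50x serial speed (illustrative assumption above).
serial_equivalent_speed = 50
team_years_per_month = serial_equivalent_speed * speed / 12
print(f"100-person-team years in that month: {team_years_per_month:.0f}")  # ~238, rounded to ~250 above
```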
I’m curious why ASI would take so much work. What exactly is the R&D labor supposed to be doing each day, that adds up to so much effort? I’m curious how people are thinking about that, if they buy into this kind of picture. Thanks :)
I’m imagining a mix of tons of effort going into optimizing experiment ideas and implementing and interpreting every experiment quickly, as well as tons of effort into more conceptual agendas given the compute shortage, some of which bear fruit but also involve lots of “wasted” effort exploring possible routes, and most of which end up needing significant experimentation as well to get working.
(My own opinion, stated without justification, is that LLMs are not a paradigm that can scale to ASI, but after some future AI paradigm shift, there will be very very little R&D separating “this type of AI can do anything importantly useful at all” and “full-blown superintelligence”. Like maybe dozens or hundreds of person-years, or whatever, as opposed to millions. More on this in a (hopefully) forthcoming post.)
I don’t share this intuition regarding the gap between the first importantly useful AI and ASI. If it’s right, that implies an extremely fast takeoff, correct? Like on the order of days from AI that can do important things to full-blown superintelligence?
Currently there are hundreds or perhaps low thousands of years of relevant research effort going into frontier AI each year. The gap between importantly useful AI and ASI seems larger than a year of current AI progress (though I’m not >90% confident in that, especially if timelines are <2 years). Then we also need to take into account diminishing returns, compute bottlenecks, and parallelization penalties, so my guess is that the required person-years should be at minimum in the thousands and likely much more. Overall the scenario you’re describing is maybe (roughly) my 95th percentile speed?
I’m curious about your definition for importantly useful AI actually. Under some interpretations I feel like current AI should cross that bar.
I’m uncertain about the LLMs thing but would lean toward pretty large shifts by the time of ASI; I think it’s more likely LLMs scale to superhuman coders than to ASI.
I think it’s not worth getting into this too much more as I don’t feel strongly about the exact 1.05x, but I feel compelled to note a few quick things:
I’m not sure exactly what you mean by eating a smaller penalty but I think the labor->progress penalty is quite large
The right way to think about 1.05x vs. 1.2x is not as a 75% reduction, but to ask what exponent n satisfies 1.05^n = 1.2 (see the sketch after this list)
Remember the 2022 vs. 2023 difference, though my guess is that the responses wouldn’t have been that sensitive to this
Also one more thing I’d like to pre-register: people who fill out the survey who aren’t frontier AI researchers will generally report higher speedups because their work is generally less compute-loaded and sometimes more greenfieldy or requiring less expertise, but we should give by far the most weight to frontier AI researchers.
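As a quick sketch of the exponent framing from the list above (nothing here beyond the 1.05x and 1.2x multipliers already under discussion):

```python
import math

# How many compounded 1.05x steps make up a 1.2x speedup?
n = math.log(1.2) / math.log(1.05)
print(f"1.05^n = 1.2 when n ≈ {n:.2f}")  # n ≈ 3.74

# So 1.2x is roughly four stacked 1.05x multipliers, a much bigger gap than
# the "75% reduction" framing (comparing the excess 0.05 vs. 0.20) suggests.
```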
Yup feel free to make that change, sounds good
No AI help seems harder to compare to since it’s longer ago; it seems easiest to use something close to today as the baseline when thinking about future speedups. Also, for timelines/takeoff modeling it’s a bit nicer to set the baseline to be more recent (looks like for those we again confusingly allowed 2024 AIs in the baseline rather than just 2023; perhaps I should have standardized that with the side panel).
I’m not sure what the exact process was; tbh my guess is that they were estimated mostly independently but likely sanity-checked with the survey to some extent in mind. It seems like they line up about right, given the 2022 vs. 2023 difference, the intuition regarding underadjusting for labor->progress, and giving weight to our own views rather than just the survey, since we’ve thought more about this than the survey takers (while of course they have the advantage of currently doing frontier AI research).
I’d make less of an adjustment if we asked people to give their reasoning including the adjustment from labor speedup to overall progress speedup and only included people who gave answers that demonstrated good understanding of this consideration and a not obviously unreasonable adjustment level.
Yup, seems good
I also realized that, confusingly, I believe the survey asks about the speedup vs. no post-2022 AIs, while the scenario side panel assumes no post-2023 AIs, which should make the side panel numbers lower; unclear exactly how much, given that 2023 AIs weren’t particularly useful.
Look at the question I mentioned above about the current productivity multiplier
I think a copy would be best, thanks!
I think the survey is an overestimate for the reason I gave above: this stuff is subtle, and researchers are likely to underestimate the drop from labor speedup to progress speedup, especially in this sort of survey where we didn’t discuss it with them verbally. Based on their responses to other questions in the survey, it seems like at least 2 people didn’t understand the difference between labor and overall progress/productivity.
Here is the survey: https://forms.gle/6GUbPR159ftBQcVF6. The question we’re discussing is: “[optional] What is the current productivity multiplier on algorithmic progress due to AI assistance?”
Edit: Also we didn’t spend large amounts of time on these early numbers, they’re not meant to be that precise but just rough best guesses.
You mean the median would be at least 1.33x rather than the previous 1.2x? Sounds about right so don’t feel the need to bet against. Also I’m not planning on doing a follow-up survey but would be excited for others to.
Most of the responses were in Nov.
This was from Nov 2024 to Mar 2025 so fairly recent. I think the transition to faster was mostly due to the transition to reasoning models and perhaps the beginnings of increased generalization from shorter to longer time horizons.
Edit: the responses are from between Nov 2024 and Mar 2025. Responses in increasing order: 1.05-1.1, 1.15, 1.2, 1.3, 2. The lowest one is the most recent, but it is from a former (not current) frontier AI researcher.
We did do a survey in late 2024 of 4 frontier AI researchers who estimated the speedup was about 1.1-1.2x. This is for their whole company, not themselves.
This also matches the vibe I’ve gotten when talking to other researchers; I’d guess they’re more likely to be overestimating than underestimating the effect, due to not adjusting enough for my next point. Keep in mind that the multiplier is for overall research progress rather than a speedup on researchers’ labor; this lowers the multiplier by a bunch because compute/data are also inputs to progress.
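One way to see the size of that labor->progress adjustment is with a toy production function; this is a sketch under my own illustrative Cobb-Douglas-style assumption, not the survey’s or AI-2027’s exact model:

```python
# Toy model: progress depends on both labor and compute/data.
# Assume progress ~ labor^alpha * compute^(1 - alpha)  (illustrative functional form).
# If only labor is sped up (compute held fixed), progress speeds up by labor_speedup^alpha.

def progress_speedup(labor_speedup: float, alpha: float = 0.5) -> float:
    """Overall progress multiplier when only researcher labor is accelerated."""
    return labor_speedup ** alpha

for labor_speedup in (1.2, 1.5, 2.0):
    print(f"{labor_speedup}x labor -> {progress_speedup(labor_speedup):.2f}x progress")
# e.g. a 2x labor speedup gives only ~1.41x overall progress when alpha = 0.5
```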
If the trend isn’t inherently superexponential and continues at 7-month doubling times by default, it does seem hard to get to AGI within a few years. If it’s 4 months, IIRC in my timelines model AGI still usually arrives after 2027, but it can be close because of intermediate AI R&D speedups, depending on how big you think the gaps between benchmarks and the real world are. I’d have to go back and look if we want a more precise answer. If you add error bars around the 4-month doubling time, that increases the chance of AGI soon, of course.
If you treat the shift from 7 to 4 month doubling times as weak evidence of superexponential, that might be evidence in favor of 2027 timelines depending on your prior.
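To make the 7- vs. 4-month comparison concrete, here is a rough extrapolation sketch; the doubling times are the ones discussed above, but the starting and target task horizons are placeholder assumptions of mine, purely for illustration:

```python
import math

# Rough time-horizon extrapolation. Doubling times (7 vs. 4 months) are from the
# discussion above; the horizons below are placeholder assumptions for illustration.

current_horizon_hours = 1.0      # assumed current AI task horizon
target_horizon_hours = 1_000.0   # assumed "AGI-ish" horizon (~6 working months)

doublings_needed = math.log2(target_horizon_hours / current_horizon_hours)  # ~10 doublings

for doubling_time_months in (7, 4):
    months = doublings_needed * doubling_time_months
    print(f"{doubling_time_months}-month doublings: ~{months / 12:.1f} years")
# ~5.8 years at 7-month doublings vs. ~3.3 years at 4-month doublings, before
# accounting for any intermediate AI R&D speedups or a superexponential trend.
```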
IMO how you should update on this just depends on your prior views (echoing Ryan’s comment). Daniel had 50% AGI by 2027 and did (and should) update to a bit lower. I’m at more like 20-25% and I think I stay about the same (and I think Ryan is similar). I think if you have more like <=10% you should probably update upward.
It underrates the difficulty of automating the job of a researcher. Real-world work environments are messy and contain lots of details that are neglected in an abstraction focused purely on writing code and reasoning about the results of experiments. As a result, we shouldn’t expect automating AI R&D to be much easier than automating remote work in general.
I basically agree. The reason I expect AI R&D automation to happen before the rest of remote work isn’t that I think it’s fundamentally much easier, but that (a) companies will try to automate it before other remote work tasks, and relatedly (b) companies have access to more data and expertise for AI R&D than for other fields.
I still think full automation of remote work in 10 years is plausible, because it’s what we would predict if we straightforwardly extrapolate current rates of revenue growth and assume no slowdown. However, I would only give this outcome around 30% chance.
In an important sense I feel like Ege and I are not actually far off here. I’m at more like 65-70% on this. I think this overall recommends quite similar actions. Perhaps we have a more important disagreement regarding something like P(AGI within 3 years), for which I’m at approx. 25-30% and Ege might be very low (my probability mass is somewhat concentrated in the next 3 years due to an expectation that compute and algorithmic effort scaling will slow down around 2029 if AGI or close isn’t achieved).
My guess is that this disagreement is less important to make progress on than disagreements regarding takeoff speeds/dynamics and alignment difficulty.
Sorry for the late reply.
I’m not 100% sure what you mean, but my guess is that you mean (B) to represent the compute used for experiments? We do project a split here and the copies/speed numbers are just for (A). You can see our projections for the split in our compute forecast (we are not confident that they are roughly right).
Re: the rest of your comment, makes sense. Perhaps the place I most disagree is that if LLMs will be the thing discovering the new paradigm, they will probably also be useful for things like automating alignment research, epistemics, etc. Also if they are misaligned they could sabotage the research involved in the paradigm shift.