Another effect here is that the AI companies often don’t want to be as reckless as I am, e.g. letting agents run amok on my machines.
Interestingly, I’ve heard from tons of skeptics I’ve talked to (e.g. Tim Lee, CSET people, AI Snake Oil) that timelines to actual impacts in the world (such as significant R&D acceleration or industrial acceleration) are going to be way longer than we say because AIs are too unreliable and risky, and therefore people won’t use them. I was more dismissive of this argument before, but:
It matches my own lived experience (e.g. I still use search way more than LLMs, even to learn about complex topics, because I have good Google-fu and LLMs make stuff up too much).
As you say, it seems like a plausible explanation for why my weird friends make way more use out of coding agents than giant AI companies.
I tentatively remain dismissive of this argument. My claim was never “AIs are actually reliable and safe now” such that your lived experience would contradict it. I too predicted that AIs would be unreliable and risky in the near term. My prediction is that after the intelligence explosion the best AIs will be reliable and safe (insofar as they want to be, that is).
...I guess just now I was responding to a hypothetical interlocutor who agrees that AI R&D automation could come soon but thinks that that doesn’t count as “actual impacts in the world.” I’ve met many such people, people who think that software-only singularity is unlikely, people who like to talk about real-world bottlenecks, etc. But you weren’t describing such a person, you were describing someone who also thinks we won’t be able to automate AI R&D for a long time.
There I’d say… well, we’ll see. I agree that AIs are unreliable and risky and that therefore they’ll be able to do impressive-seeming stuff that looks like they could automate AI R&D well before they actually automate AI R&D in practice. But… probably by the end of 2025 they’ll be hitting that first milestone (imagine e.g. an AI that crushes RE-Bench and also can autonomously research & write ML papers, except the ML papers are often buggy and almost always banal / unimportant, and the experiments done to make them had a lot of bugs and wasted compute, and thus AI companies would laugh at the suggestion of putting said AI in charge of a bunch of GPUs and telling it to cook.) And then two years later maybe they’ll be able to do it for real, reliably, in practice, such that AGI takeoff happens.
Maybe another thing I’d say is “One domain where AIs seem to be heavily used in practice is coding, especially coding at frontier AI companies (according to friends who work at these companies and report fairly heavy usage). This suggests that AI R&D automation will happen more or less on schedule.”
I’m not talking narrowly about your claim; I just think this very fundamentally confuses most people’s basic models of the world. People expect, from their unspoken models of “how technological products improve,” that long before you get a mind-bendingly powerful product that’s so good it can easily kill you, you get something that’s at least a little useful to you (and then you get something that’s a little more useful to you, and then something that’s really useful to you, and so on). And in fact that is roughly how it’s working, but only for programmers, not for a lot of other people.
Because I’ve engaged so much with the conceptual case for an intelligence explosion (i.e. the case that this intuitive model of technology might be wrong), I roughly buy it even though I am still getting almost no use out of AIs. But I have a huge amount of personal sympathy for people who feel really gaslit by it all.
To put it another way: we probably both agree that if we had gotten AI personal assistants that shop for you and book meetings for you in 2024, that would have been at least some evidence for shorter timelines. So their absence is at least some evidence for longer timelines. The question is what your underlying causal model was: did you think that if we were going to get superintelligence by 2027, then we really should see personal assistants in 2024? A lot of people strongly believe that, you (Daniel) hardly believe it at all, and I’m somewhere in the middle.
If we had gotten both the personal assistants I was expecting and benchmark progress 2x faster than I was expecting, my timelines would be the same as yours are now.
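One way to make the “what was your underlying causal model” question concrete is Bayes’ rule in odds form. This is only an illustrative sketch: the symbols S (superintelligence by 2027) and A (working personal assistants in 2024) are labels introduced here, not notation anyone in the thread used.

```latex
% S = superintelligence by 2027, A = working personal assistants in 2024
% (both symbols are assumptions introduced for this sketch)
\[
\underbrace{\frac{P(S \mid \neg A)}{P(\neg S \mid \neg A)}}_{\text{posterior odds}}
=
\underbrace{\frac{1 - P(A \mid S)}{1 - P(A \mid \neg S)}}_{\text{likelihood ratio}}
\cdot
\underbrace{\frac{P(S)}{P(\neg S)}}_{\text{prior odds}}
\]
```

If assistants are more likely in worlds with S than in worlds without it, the likelihood ratio is below 1, so their absence lowers the odds of S. Someone who thinks P(A | S) is close to 1 gets a large update toward longer timelines; someone who thinks P(A | S) is roughly equal to P(A | ¬S) gets a ratio near 1 and barely updates.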
That’s reasonable. Seems worth mentioning that I did make predictions in “What 2026 Looks Like,” and eyeballing them now I don’t think I was saying that we’d have personal assistants that shop for you and book meetings for you in 2024, at least not in a way that really works. (I say at the beginning of 2026 “The age of the AI assistant has finally dawned.”) In other words, I think even in 2021 I was thinking that widespread, actually useful AI assistants would happen about a year or two before superintelligence. (Not because I have opinions about the orderings of technologies in general, but because I think that once an AGI company has had a popular working personal assistant for two years they should be able to figure out how to make a better version that dramatically speeds up their R&D.)
Indeed, I believe this is the main explanation for why my median timelines are longer than those in, say, Situational Awareness, and why AI isn’t nearly as impactful as people used to think it would be back in the day.
The big difference from a lot of skeptics is that I believe this adds at most 1-2 decades to the timeline for making AI very, very useful, not multiple decades.
Yeah TBC, I’m at even less than 1-2 decades added, more like 1-5 years.
i’ve recently been letting more AI agents run amok, and i’ve found Claude was actually more aligned: it did stuff i asked it not to do much less often than oai models, enough that it actually made a difference lol
lol what? Can you compile/summarize a list of examples of AI agents running amok in your personal experience? To what extent was it an alignment problem vs. a capabilities problem?
not running amok, just not reliably following instructions like “only modify files in this folder” or “don’t install pip packages”. Claude follows instructions correctly; some other models are mode-collapsed into a certain way of doing things, e.g. gpt-4o always thinks it’s running python in the chatgpt code interpreter, and you need very strong prompting to make it behave in a way specific to your computer
a hypothetical typical example would be: it tries to use the file /usr/bin/python because it’s memorized that that’s the path to python, that fails, then it concludes it must create that folder, which would require sudo permissions; if it can, it could potentially mess something up
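For anyone who wants the two instructions above enforced in code rather than by prompting alone, here is a minimal sketch of a guard an agent harness could call before acting. Everything in it is an assumption for illustration: the function names, the blocked-command list, and the workspace path are hypothetical and do not come from any particular agent framework. The command check is a plain prefix match, so it is easy to get around (e.g. `python -m pip install ...`) and should not be treated as a real sandbox.

```python
from pathlib import Path
from typing import Sequence

# Hypothetical deny-list: command prefixes the agent was told not to run.
BLOCKED_PREFIXES = (("pip", "install"), ("pip3", "install"), ("sudo",))


def is_write_allowed(target: str, workspace: str) -> bool:
    """Return True only if `target` resolves to a path inside `workspace`."""
    root = Path(workspace).resolve()
    # Joining with an absolute `target` discards `root`, so absolute paths
    # outside the workspace are rejected by the containment check below.
    resolved = (root / target).resolve()
    return resolved == root or root in resolved.parents


def is_command_allowed(argv: Sequence[str]) -> bool:
    """Reject commands that start with a blocked prefix such as `pip install`."""
    return not any(tuple(argv[: len(p)]) == p for p in BLOCKED_PREFIXES)


if __name__ == "__main__":
    ws = "/tmp/agent_ws"  # hypothetical workspace folder
    print(is_write_allowed("notes/a.txt", ws))
    print(is_write_allowed("/usr/bin/python", ws))
    print(is_command_allowed(["pip", "install", "requests"]))
```

The point of the design is that the harness, not the model, decides whether an action is in bounds, so a model that is mode-collapsed into assuming a different environment gets a refusal instead of a chance to touch /usr/bin.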