If your model says that LLMs are unlikely to scale up to ASI, that alone is not sufficient for a low p(doom). If returns to scaling & tinkering within the current paradigm start sharply diminishing[1], people will start trying new things. Some of them will eventually work.
That slowdown seems like it would need to happen relatively soon if we’re to hit a wall before ASI.
Such a world could even be more dangerous. LLMs are steerable and relatively weak at consequentialist planning. There is AFAICT no fundamental reason why the next paradigm couldn’t be even less interpretable, less steerable, and more capable of dangerous optimization at a given level of economic utility.
I have a pretty huge amount of uncertainty about the distribution of how hypothetical future paradigms score on those (and other) dimensions, but there does seem to be room for it to be worse, yeah.
ETA: (To be clear, something that looks relevantly like today’s LLMs while still having superhuman scientific R&D capabilities seems quite scary, and if we find ourselves there in, say, 5 years, then I think we’re pretty fucked. I don’t want anyone to think that I’m particularly optimistic about the current paradigm’s safety properties.)
I have a post saying the same thing :)
Ah, yep, I read it at the time; this has just been on my mind lately, and sometimes the obvious bears repeating.
Hypothetical autonomous-researcher LLMs would be ~100x faster than human researchers, so such LLMs could quickly improve on LLMs themselves; at that speed, a year of wall-clock time corresponds to a century of research effort. That is, non-ASI LLMs may be the ones trying new things, as soon as they reach autonomous research capability.
The crux is then whether LLMs scale up to autonomous researchers (through mostly emergent ability, not requiring significantly novel scaffolding or post-training), not whether they scale up directly to ASI.