I have a pretty huge amount of uncertainty about the distribution of how hypothetical future paradigms score on those (and other) dimensions, but there does seem to be room for it to be worse, yeah.
ETA: (To be clear, something that looks relevantly like today’s LLMs while still having superhuman scientific R&D capabilities seems quite scary, and I think if we find ourselves there in, say, 5 years, then we’re pretty fucked. I don’t want anyone to think that I’m particularly optimistic about the current paradigm’s safety properties.)