Nothing in this post or the associated logic says LLMs make AGI safe, just safer than what we were worried about.
Nobody with any sense predicted runaway AGI by this point in history. There’s no update from other approaches to AGI not having worked yet.
There’s a weird thing where lots of people’s p(doom) went up when LLMs started to work well, because they found it an easier route to intelligence than they’d been expecting. If it’s easier, it happens sooner and with less thought surrounding it.
LLMs are easy to turn into agents, so let’s not get complacent. But they are remarkably easy to control and align, and that’s good news for aligning the agents we build from them. That doesn’t get us out of the woods, though: there are new issues with self-reflective, continuously learning agents, and there’s plenty of room for misuse and conflict escalation in a multipolar scenario, even if alignment turns out to be dead easy if you bother to try.
See Porby’s comment on his risk model for language model agents. It’s a more succinct statement of my views.