Yeah, so I guess opinions on this will differ depending on how likely people think existential risk from AGI is. Personally, it's clear to me that agentic misaligned superintelligences are bad news, but I'm much less persuaded by accounts of how long-term maximising behaviour would arise in something like an oracle. The prospect of an AGI that's much more intelligent than humans yet much less agentic seems quite plausible to me, perhaps even in an RL agent.