It doesn’t matter how far an AI goes in pursuit of its goal(s); what matters is whether humans get run over as it’s getting going.
We often think AGI is dangerous if it maximizes an unaligned goal. I think this is quite wrong. AGI is dangerous if it pursues an unaligned goal more competently than humans.
The interest in quantilizers (not-quite-maximizers) seems to be a notable product of this confusion. I’m concerned that this was seriously pursued; it seems so obviously confused about the core logic of AI risk.
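For readers who haven’t met the term: a quantilizer, instead of taking the argmax over actions, samples from the top q fraction (by expected utility) of some base distribution over actions, e.g. “things a human might plausibly do”. Below is a minimal sketch of that idea, assuming a finite action set and a discrete base distribution, and ignoring the edge case of splitting probability mass at the quantile boundary; the function name and the toy numbers are hypothetical, not from any particular implementation.

```python
import numpy as np

def quantilize(actions, utility, base_probs, q, rng=None):
    """Sketch of a q-quantilizer over a finite action set.

    Instead of returning argmax_a utility(a), sample an action from the
    top-q fraction (by utility) of a base distribution `base_probs`.
    Simplification: the action straddling the quantile boundary is
    dropped rather than having its probability mass split.
    """
    rng = rng or np.random.default_rng()
    base_probs = np.asarray(base_probs, dtype=float)
    scores = np.array([utility(a) for a in actions])

    order = np.argsort(-scores)          # action indices, best to worst
    cum = np.cumsum(base_probs[order])   # base mass covered so far
    keep = cum <= q                      # keep actions until q mass is covered
    if not keep.any():
        keep[0] = True                   # always keep at least the best action

    kept = order[keep]
    probs = base_probs[kept] / base_probs[kept].sum()  # renormalize within top-q
    return actions[rng.choice(kept, p=probs)]

# Hypothetical toy example: the "extreme plan" has the highest utility but
# almost no base-distribution mass, so a q=0.5 quantilizer almost always
# picks an ordinary action instead of the extreme one.
actions = ["do nothing", "modest plan", "extreme plan"]
utilities = {"do nothing": 0.0, "modest plan": 5.0, "extreme plan": 100.0}
base = [0.60, 0.38, 0.02]
print(quantilize(actions, lambda a: utilities[a], base, q=0.5))
```

As q goes to 0 this recovers a pure maximizer, and as q goes to 1 it just reproduces the base distribution; the construction is meant to bound how hard the system optimizes relative to that base.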
This objection to AGI risk (that if it has sub-agents it won’t be a maximizer) doesn’t make sense. It’s proposing “AGI won’t work”. We’re going to build it to work. Or it’s hoping that competing goals will cancel out in exactly the right way to keep humans in the game.
AGI is dangerous if it pursues an unaligned goal more competently than humans. [...] It’s proposing “AGI won’t work”.
I’d say it’s proposing something like “minds, including AGIs, generally aren’t agentic enough to reliably exert significant power on the world”, with an implicit assumption like “minds that look like they have done so have mostly just gotten lucky, or have benefited from something like lots of built-up cultural heuristics that are only useful in a specific context and would break down in a sufficiently novel situation”.
I agree that even if this were the case, it wouldn’t eliminate the argument for AI risk; even granting that, AIs could still become more competent than us, and eventually some of them could get lucky too. My impression of the original discussion was that the argument wasn’t meant as an argument against all AI risk, but rather only against hard-takeoff scenarios depicting a single AI that takes over the world by being supremely intelligent and agentic.