I feel like there’s a difference between “modeling” and “statistical recognition”, in the sense that current (and near-future) AI systems don’t necessarily model the world around them.
There is an entire subfield of ML called model-based reinforcement learning.
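To make the distinction concrete, here is a minimal sketch of the model-based idea: the agent first learns an explicit transition model from experience, then plans entirely inside that learned model. The toy chain environment and all names here are my own illustration, not any particular published algorithm.

```python
import random

# Toy chain MDP: states 0..4, actions move left/right, reward 1 at state 4.
# Purely illustrative; a minimal sketch of model-based RL, nothing more.
N_STATES, ACTIONS, GOAL, GAMMA = 5, (-1, +1), 4, 0.9

def step(state, action):
    next_state = min(max(state + action, 0), N_STATES - 1)
    return next_state, 1.0 if next_state == GOAL else 0.0

# 1. Learn an explicit world model (transitions + rewards) from random experience.
model = {}  # (state, action) -> (next_state, reward)
for _ in range(1000):
    s, a = random.randrange(N_STATES), random.choice(ACTIONS)
    model[(s, a)] = step(s, a)  # deterministic env, so one sample suffices

# 2. Plan inside the learned model with value iteration -- the real
#    environment is never touched again during planning.
V = [0.0] * N_STATES
for _ in range(100):
    V = [max(model[(s, a)][1] + GAMMA * V[model[(s, a)][0]] for a in ACTIONS)
         for s in range(N_STATES)]

policy = [max(ACTIONS, key=lambda a: model[(s, a)][1] + GAMMA * V[model[(s, a)][0]])
          for s in range(N_STATES)]
print(policy)  # every state points toward the goal: [1, 1, 1, 1, 1]
```

Contrast this with a model-free method like Q-learning, which improves its policy directly from rewards without ever representing the transition dynamics at all.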
You’d think that to destroy a world, you first need to have a model of it, but that may not be the case.
Natural selection is an existence proof (minus anthropic effects) that you can produce world-altering agents without explicitly using models.
There may be a sense in which generating text and maneuvering the real world are very different.
Well yes, which is why I’m less worried about GPT-3 than EfficientZero.
There may be a sense in which successfully imitating human speech without a “model” or agency is possible.
It is trivially true, and trivially false if you ask the AI adversarial questions that require AGI-completeness.
There may also be binding constraints just as strong (or stronger) that prevent even a superintelligent agent from achieving its goals, constraints that aren’t “defects” in the agent itself but constants of the universe. One such example is the speed of light: however intelligent you are, that’s a physical limit you simply can’t surpass.
Sure, but one does not need to surpass the speed of light to destroy humanity.
There may also be a sense in which AI systems would not self-improve beyond what is required for what we want from them. That is, the needs for which we design and produce AI systems may be fully met by a class of agents that stop receiving any negative feedback once they reach a certain level of proficiency or ability.
Who is “we”? What is the mechanism by which any AI outside this class will be completely and permanently prevented from coming into existence? This is my criticism of the rest of the points as well. Your strategy for AI risk seems to be “Let’s not build the sort of AI that would destroy the world”, which fails at the first word: “Let’s”.
I don’t have a strategy; I’m basically just thinking out loud about a couple of specific points. Building a strategy for preventing that type of AI is important, but I don’t (yet?) have any ideas in that area.
Ok, perhaps I was too combative with the wording. My general point is: Don’t think of humanity as a coordinated agent, don’t think of “AGI” as a single tribe with particular properties (I frequently see this same mistake with regard to aliens), and in particular, don’t think that because a specific AI won’t be able to, or won’t want to, destroy the world, the world is therefore saved in general.