I’m curious which of these you think people are aware of: that the idea of goal-directedness from the value learning sequence is captured by model-based RL, or that any sufficiently powerful agent (being implicitly goal-directed) needs to be model-based rather than model-free?
If that’s the former, I’m really interested in links to posts and comments pointing that out, as I don’t know of any. And if that’s the latter, then it seems that it goes back to asking whether powerful agents must be goal-directed.
The former (that is, model-based RL → agent). The latter (smart agent → model-based RL), I think, would rest on a bit of a level error. At bottom, there are only atoms and the void. Whether something is “really” an agent is a question of how well we can describe that collection of atoms with an agent-shaped model. This is a different question from which abstractions humans used in the process of programming the AI; like Rohin says, parts of the agent might be implicit in the programming rather than explicit.
Sorry, I don’t know if I can direct you to any explicit sources. If you check out papers like Concrete Problems in AI Safety or others in that genre, though, you’ll see model-based RL used as a simplifying set of assumptions that imply agency.
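For concreteness, here’s a minimal sketch of the model-free/model-based distinction the question turns on (a toy example of my own, not drawn from any of the posts or papers mentioned; the environment and all names are invented). The model-free learner updates action values directly from experience and represents the dynamics nowhere; the model-based one holds an explicit transition model and derives its policy by planning against it, which is what makes the agent-shaped description so natural.

```python
import random

# Toy deterministic chain environment, invented purely for illustration:
# states 0..4, actions -1/+1, reward 1 on reaching the terminal state 4.
N_STATES = 5
TERMINAL = N_STATES - 1
ACTIONS = (-1, +1)

def step(state, action):
    next_state = min(max(state + action, 0), TERMINAL)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward

def model_free(episodes=500, alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular Q-learning: the dynamics are represented nowhere;
    action values are updated directly from sampled transitions."""
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != TERMINAL:
            a = (random.choice(ACTIONS) if random.random() < eps
                 else max(ACTIONS, key=lambda act: Q[(s, act)]))
            s2, r = step(s, a)
            best_next = max(Q[(s2, act)] for act in ACTIONS)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q

def model_based(gamma=0.9, sweeps=50):
    """Learn an explicit transition/reward model, then plan against it
    with value iteration -- the explicitly agent-shaped part."""
    # The environment is deterministic, so one sample per (s, a) pair
    # suffices to "learn" the model exactly.
    model = {(s, a): step(s, a) for s in range(N_STATES) for a in ACTIONS}
    V = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(TERMINAL):  # V[TERMINAL] stays 0
            V[s] = max(r + gamma * V[s2]
                       for s2, r in (model[(s, a)] for a in ACTIONS))
    return V

if __name__ == "__main__":
    print(model_free()[(0, +1)])  # ~0.729 = gamma^3, learned from samples
    print(model_based()[0])       # same value, computed by planning
```

Both arrive at the same optimal values here; the difference is purely in whether the dynamics are represented explicitly, which is why model-based RL makes such a convenient simplifying assumption in that genre of paper.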