Yup, I’m pretty sure people are aware of this :) See also the model of an agent as something with preferences, beliefs, available actions, and a search+decision algorithm that makes it take the actions it believes will best serve its preferences.
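To make that picture concrete, here is a minimal sketch of that agent model (the function names and the toy environment are mine, purely illustrative): beliefs about the world, a set of available actions, preferences over outcomes, and a depth-limited search that picks the action the agent believes serves those preferences best.

```python
def plan(belief, actions, preferences, simulate, depth=3):
    """Depth-limited search: return the first action of the best believed plan."""
    def value(state, d):
        if d == 0:
            return preferences(state)
        return max(value(simulate(state, a), d - 1) for a in actions)
    return max(actions, key=lambda a: value(simulate(belief, a), depth - 1))

# Toy usage: the agent believes it is at position 0 and prefers to be near 5.
actions = [-1, 0, +1]
simulate = lambda pos, a: pos + a          # the agent's belief about the dynamics
preferences = lambda pos: -abs(pos - 5)    # higher is better
print(plan(0, actions, preferences, simulate, depth=3))  # prints 1
```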
But future AI research will require some serious generalizations that are left un-generalized in current methods. A simple gridworld problem might treat the entire grid as a known POMDP and do search over possible sequences of actions. Obviously the real world isn’t a known POMDP, so suppose we just call it an unknown POMDP and try to learn it through observation. Now, all of a sudden, you can’t hand-specify a cost function in terms of the world model anymore, so that needs to be re-evaluated as well.
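Roughly what I mean, as a sketch (all class and function names here are hypothetical, not from any existing codebase): in the known-gridworld case the cost function is written directly over a state space we designed, but once the world model is learned from observation, the “state” is a learned representation, and there is nothing human-legible left to hand-write a cost over.

```python
# Illustrative contrast only -- all names here are hypothetical.

# Known POMDP (gridworld): the state space is hand-designed, so a cost
# function can be specified directly in terms of it.
GOAL = (4, 4)

def gridworld_cost(state):
    # state is an (x, y) cell that we, the designers, chose
    return 0.0 if state == GOAL else 1.0

# Unknown POMDP learned from observation: the "state" is whatever latent
# representation the learned model settles on.
class LearnedWorldModel:
    def __init__(self, latent_dim=32):
        self.latent_dim = latent_dim

    def encode(self, observation):
        # stand-in for a learned encoder fit from data
        return [0.0] * self.latent_dim

def learned_cost(latent_state):
    # There is no longer an obvious hand-written mapping from latent vectors
    # to "what we want" -- this is the piece that has to be re-evaluated.
    raise NotImplementedError("the cost must now be specified some other way")
```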
I’m curious about what you think people are aware of: that the idea of goal-directedness from the value learning sequence is captured by model-based RL, or that any sufficiently powerful agent (implicitly goal-directed) needs to be model-based instead of model-free?
If that’s the former, I’m really interested in links to posts and comments pointing that out, as I don’t know of any. And if that’s the latter, then it seems that it goes back to asking whether powerful agents must be goal-directed.
The former (that is, model-based RL → agent). The latter (smart agent → model-based RL), I think, would be founded on a bit of a level error. At bottom, there are only atoms and the void. Whether something is “really” an agent is a question of how well we can describe this collection of atoms in terms of an agent-shaped model. This is different from the question of what abstractions humans used in the process of programming the AI; like Rohin says, parts of the agent might be thought of as implicit in the programming, rather than explicit.
Sorry, I don’t know if I can direct you to any explicit sources. If you check out papers like Concrete Problems in AI Safety or others in that genre, though, you’ll see model-based RL used as a simplifying set of assumptions that imply agency.