Since you say that goal-directed behavior is not about having a model or not, is it about the form of the model, or about how the model is used?
I’m thinking that there may not be any model. Consider for example an agent that solves (simply connected) mazes by implementing the right hand rule: such an agent seems at least somewhat goal-directed, but it’s hard for me to see a model anywhere in this agent.
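(For concreteness, here is a minimal sketch of the hard-coded right-hand-rule agent being described; the code and names are illustrative, not from the discussion. The agent keeps a wall on its right and, in a simply connected maze, eventually reaches the exit.)

```python
# Hypothetical sketch of a hard-coded "right hand rule" maze agent.
# It follows the wall on its right; in a simply connected maze this
# reaches the exit, yet the agent keeps no map, no beliefs, and does
# no lookahead -- it is hard to point to a model anywhere in it.

# Directions ordered clockwise: up, right, down, left.
DIRS = [(-1, 0), (0, 1), (1, 0), (0, -1)]

def right_hand_rule(maze, start, goal, max_steps=10_000):
    """maze: 2D list where True means wall; start/goal: (row, col) tuples."""
    def is_open(pos):
        r, c = pos
        return 0 <= r < len(maze) and 0 <= c < len(maze[0]) and not maze[r][c]

    pos, facing = start, 0  # facing indexes into DIRS
    path = [pos]
    for _ in range(max_steps):
        if pos == goal:
            return path
        # Prefer turning right, then going straight, then left, then back.
        for turn in (1, 0, -1, 2):
            d = (facing + turn) % 4
            nxt = (pos[0] + DIRS[d][0], pos[1] + DIRS[d][1])
            if is_open(nxt):
                facing, pos = d, nxt
                path.append(pos)
                break
    return None  # gave up (or the maze is not simply connected)
```

Every decision is purely local (which neighboring cells are open), which is why it is hard to see anything in this agent that looks like a model of the maze.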
Would a model-based agent that did not adapt its model when the environment changed be considered not goal-directed (like the lookup-table agent in your example)?
Yeah, I think that does make it less goal-directed.
About the “right hand rule” agent, I feel it depends on whether it is a hard-coded agent or a learning agent. If it is hard-coded, then clearly it doesn’t require a model. But if it learns such a rule, I would assume it was inferred from a learned model of what mazes are.
For the non-adaptive agent, you say it is less goal-directed; do you see goal-directedness as a continuous spectrum, as a set of zones on this spectrum, or as a binary threshold on this spectrum?
About the “right hand rule” agent, I feel it depends on whether it is a hard-coded agent or a learning agent.
Yes, I meant the hard-coded one. It still seems somewhat goal-directed to me.
do you see goal-directedness as a continuous spectrum, as a set of zones on this spectrum, or as a binary threshold on this spectrum?
Oh, definitely a continuous spectrum. (Though I think several people disagree with me on this, and see it more like a binary-ish threshold. Such people often say things like “intelligence and generalization require some sort of search-like cognition”. I don’t understand their views very well.)
Do you have references to posts by the people who think goal-directedness is binary-ish? That would be very useful, thanks. :)
Uh, not really. The mesa optimizers sequence sort of comes from this viewpoint, as does this question, but I haven’t really seen any posts arguing for this position.