Okay, suppose we feed many environment-states into some factored representation of possible objectives, and generate a lot of (environment, objectives) mappings for a given agent. In your model, is it possible to summarize these results somehow; is it possible to say something general about what the agent is trying to do in all of these environments? (E. g., like my football & chess example.)
Yes, it’s possible to do summary statistics on the outputted goals, just like you can do summary statistics on the outputs of GPT-3, or on the plans produced by a given search algorithm. That doesn’t make the generators of these things have the same type signature as the things themselves.
My counterpoint to John is specifically about the sort of computational structures that can represent goals, while being both simple AND environment/belief-dependent. I’m saying simplicity does not push against representing goals in an environment-dependent way, because your generator of goals can be conditioned on the environment.
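The type-signature distinction being argued here can be made concrete with a toy sketch (all types and names below are illustrative, not anyone’s actual formalism): a “goal” is a predicate over world-states, while a “generator of goals” maps environments to goals. The generator stays simple precisely because it is conditioned on the environment.

```python
from typing import Callable

# Toy types (illustrative): an Environment is a string tag, a WorldState a string.
Environment = str
WorldState = str

# A "goal" here is a predicate over world-states: which states count as success.
Goal = Callable[[WorldState], bool]

# A "generator of goals" has a different type signature: it takes an
# environment and returns a goal. One compressed generator, many
# environment-specific objectives.
GoalGenerator = Callable[[Environment], Goal]

def winning(env: Environment) -> Goal:
    # Hypothetical win conditions for the two example environments.
    if env == "chess":
        return lambda state: state == "opponent checkmated"
    if env == "football":
        return lambda state: state == "more goals scored"
    return lambda state: False

chess_goal = winning("chess")             # a Goal, not a GoalGenerator
print(chess_goal("opponent checkmated"))  # True
```

Note that `winning` and `chess_goal` have different types: confusing them is exactly the naming collision being objected to.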
> Yes, it’s possible to do summary statistics on the outputted goals
How “meaningful” would that summary be? Does my “winning at chess vs football” analogy fit what you’re describing, with “winning” being the compressed objective-generator and the actual win conditions of chess/football being the environment-specific objectives?
My point is that you can have “goals” (things your search process steers the world towards) and “generators of goals”. These are different things, and you should not use the same name for them.
More specifically, there is a difference in the computational type signature between generators and the things they generate. You can call these two things by whatever label you like, but they are not the same thing.
You can look at a person’s plans/behavior in many different games and conclude that it demonstrates a common thread which you might label “winning”. But you should not call the latent cognitive generators responsible for this common thread by the same name you use for the world states the person’s search process steers towards in different environments.
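The kind of summarization being described — extracting a common-thread label from many (environment, objective) pairs — can be sketched crudely (the per-environment objectives below are made-up illustrations):

```python
# Hypothetical per-environment objectives inferred from an agent's behavior.
observed_objectives = {
    "chess": "win by checkmating the opponent",
    "football": "win by scoring more goals",
    "poker": "win by taking the most chips",
}

# A crude summary statistic: which words recur across all environments?
word_sets = [set(desc.split()) for desc in observed_objectives.values()]
common = set.intersection(*word_sets)
print(common)  # contains "win" -- the label for the common thread
```

The summary is a statistic over the generated objectives; it is not itself the generator, which has a different type.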
Alright, then it is a semantics debate from my perspective. I don’t think we’re actually disagreeing, now. Your “objective-generators” cleanly map to my “goals”, and your “objectives” to my “local implementations of goals” (or maybe “values” and “local interpretations of values”). That distinction definitely makes sense at the ground level. In my ontology, it’s a distinction between what you want and what achieving it looks like in a given situation.
I think it makes more sense to describe it my way, though, since I suspect there’s a continuum of ever-more-specific/local objectives (“winning” as an environment-independent goal, “winning” in this type of game, “winning” against the specific opponent you have, “winning” given this game and opponent and the tactic they’re using), rather than a dichotomy of “objective-generator” vs “objective”, but that’s a finer point.
Although, digging into the previously-mentioned finer points, I think there is room for some meaningful disagreement.
I don’t think there are goal-generators as you describe them. I think there are just goals, and then some plan-making/search mechanism which does goal translation/adaptation/interpretation for any given environment the agent is in. I. e., the “goal generators” are separate pieces from the “ur-goals” they take as input.
And as I’d suggested, there’s a continuum of ever-more-specific objectives. In this view, I think the line between “goals” and “plans” even blurs, so that the most specific “objectives” are just “plans”. In this case, the “goal generator” is just the generic plan-making process working in a particular goal-interpreting regime.
(Edited-in example: “I want to be a winner” → “I want to win at chess” → “I want to win this game of chess” → “I want to decisively progress towards winning in this turn” → “I want to make this specific move”. The early steps here are clear examples of goal-generation/translation (what does winning mean in chess?), the later steps clear examples of problem-solving (how do I do well this turn?), but they’re just extreme ends of a continuum.)
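The refinement chain in that example can be sketched as repeated specialization, where each step conditions the previous objective on more context (the steps are just illustrative strings, not a real planner):

```python
# Each refinement step takes an objective and some context, returning a more
# specific objective. At the most specific end, the "objective" is a plan step.
def refine(goal: str, context: str) -> str:
    return f"{goal}, given {context}"

goal = "be a winner"
for context in ("the game is chess",
                "this particular match",
                "the current turn",
                "the best available move"):
    goal = refine(goal, context)

print(goal)
```

There is no type break anywhere along the chain, which is the point: the same specialization step carries you from environment-independent goal to concrete move.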
The initial goal-representations from which that process starts could be many things — mathematically-precise environment-independent utility functions, or goals defined over some default environment (as I suspect is the case with humans), or even step-one objective-generators, as you’re suggesting. But the initial representation being an objective-generator itself seems like a weirdly special case, not how this process works in general.