Past Account comments on MDP models are determined by the agent architecture and the environmental dynamics

Past Account 29 May 2021 18:02 UTC
LW: 1 AF: -2
AF
[Deleted]
- TurnTrout 29 May 2021 22:38 UTC
  LW: 7 AF: 8
  AF Parent
  I read your formalism, but I didn’t understand what prompted you to write it. I don’t yet see the connection to my claims.
  If so, I might try to formalize it.
  Yeah, I don’t want you to spend too much time on a bulletproof grounding of your argument, because I’m not yet convinced we’re talking about the same thing.
  In particular, if the argument’s like, “we usually express reward functions in some featurized or abstracted way, and it’s not clear how the abstraction will interact with your theorems” / “we often use different abstractions to express different task objectives”, then that’s something I’ve been thinking about but not what I’m covering here. I’m not considering practical expressibility issues over the encoded MDP: (“That’s also a claim that we can, in theory, specify reward functions which distinguish between 5 googolplex variants of red-ghost-game-over.”)
  If this doesn’t answer your objection—can you give me an english description of a situation where the objection holds? (Let’s taboo ‘model’, because it’s overloaded in this context)
  - Past Account 30 May 2021 13:21 UTC
    LW: -1 AF: -3
    AF Parent
    [Deleted]
    - TurnTrout 30 May 2021 15:35 UTC
      LW: 7 AF: 8
      AF Parent
      I don’t understand your point in this exchange. I was being specific about my usage of model; I meant what I said in the original post, although I noted room for potential confusion in my comment above. However, I don’t know how you’re using the word.
      I don’t use the term model in my previous reply anyway.
      You used the word ‘model’ in both of your prior comments, and so the search-replace yields “state-abstraction-irrelevant abstractions.” Presumably not what you meant?
      I already pointed out a concrete difference: I claim it’s reasonable to say there are three alternatives while you claim there are two alternatives.
      That’s not a “concrete difference.” I don’t know what you mean when you talk about this “third alternative.” You think you have some knockdown argument—that much is clear—but it seems to me like you’re talking about a different consideration entirely. I likewise feel an urge to disengage, but if you’re interested in explaining your idea at some point, message me and we can set up a higher-bandwidth call.
      - Past Account 31 May 2021 20:59 UTC
        1 point
        Parent
        [Deleted]