It seems like we’re mostly on the same page with this proposal. Probably something that’s going on is that the notion of “episodic RL” in my head is quite broad, to the point where it includes things like taking into account an ever-expanding history (each episode is “do the right thing in the next round, given the history”). But at that point it’s probably better to use a different formalism, such as the one you describe.
My objection was the one you acknowledge: “this desideratum seems much too weak for FAI”.
It seems like we’re mostly on the same page with this proposal. Probably something that’s going on is that the notion of “episodic RL” in my head is quite broad, to the point where it includes things like taking into account an ever-expanding history (each episode is “do the right thing in the next round, given the history”). But at that point it’s probably better to use a different formalism, such as the one you describe.
My objection was the one you acknowledge: “this desideratum seems much too weak for FAI”.