Johannes Treutlein comments on Non-myopia stories

Johannes Treutlein 17 Nov 2023 20:09 UTC
3 points
0
I found this clarifying for my own thinking! Just a small additional point: in "Hidden Incentives for Auto-Induced Distributional Shift", there is also the example of a Q-learner that learns to sometimes take a non-myopic action (I believe cooperating with its past self in a prisoner’s dilemma), without any meta-learning.
- lberglund 18 Nov 2023 1:50 UTC
  1 point
  0
  Thanks for pointing this out! I will make a note of that in the main post.