I’m not sure if these kinds of comments are acceptable on this site, but I just wanted to say thank you for this sequence. I doubt I will significantly change my life after reading this, but I hope to change it at least a little in this direction.
Viewing myself as a reinforcement learning agent that balances policy improvement (taking my present model and thinking about how to tweak my actions to optimize rewards assuming my model is correct) and exploration (observing how the world actually responds to certain actions to update the model), I have historically spent far too much time on policy improvement.
This sequence provides a nice set of guidelines and methods to shift gears and really think about what it even means to improve one’s model of the world, in a way that seems… fun? fulfilling? I hope to report back on this in a few months and say how it’s gone; there is a high probability that I fall back into old habits, but I hope I do not.