mattmacdermott comments on Ayn Rand’s model of “living money”; and an upside of burnout

mattmacdermott 3 Dec 2024 8:16 UTC
1 point
0
Not sure. I guess you also have to exclude policy gradient methods that make use of learned value estimates. “Learned evaluation vs sampled evaluation” is one way you could say it.

Model-based vs model-free does feel quite appropriate, shame it’s used for a narrower kind of model in RL. Not sure if it’s used in your sense in other contexts.