Jobst Heitzig comments on Goodhart’s Law in Reinforcement Learning

Jobst Heitzig 16 Oct 2023 13:41 UTC
3 points
1
Excellent! I have three questions:
1. How would we get to a certain upper bound on $\theta$?
2. As collisions with the boundary happen exactly when one action’s probability hits zero, it seems the resulting policies are quite large-support, hence quite probabilistic, which might be a problem in itself, making the agent unpredictable. What is your thinking about this?
3. Related to 2., it seems that while your algorithm ensures that the expected true return cannot decrease, it might still lead to quite low true returns in individual runs. So do you agree that this type of algorithm is one safety ingredient amongst others, rather than meant to be a sufficient solution to safety?
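
To make question 3 concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration only: the two one-step policies, their action names, probabilities and true returns are made up and are not taken from the post or its algorithm. It only shows that a stochastic policy can have a higher expected true return than a deterministic baseline while still producing a much lower return in some individual runs.

```python
from typing import Dict, Tuple

# Hypothetical one-step policies: action name -> (probability, true return).
# All numbers are invented for illustration.
Policy = Dict[str, Tuple[float, float]]

baseline: Policy = {"a": (1.0, 1.0)}                  # deterministic policy
updated: Policy = {"b": (0.9, 2.0), "c": (0.1, -5.0)}  # stochastic policy


def expected_return(policy: Policy) -> float:
    """Probability-weighted average of the true returns."""
    return sum(p * r for p, r in policy.values())


def prob_below(policy: Policy, threshold: float) -> float:
    """Probability that a single run yields a return below `threshold`."""
    return sum(p for p, r in policy.values() if r < threshold)


def worst_case(policy: Policy) -> float:
    """Lowest return among actions that have positive probability."""
    return min(r for p, r in policy.values() if p > 0)


if __name__ == "__main__":
    base = expected_return(baseline)
    print("expected return, baseline:", base)
    print("expected return, updated: ", expected_return(updated))
    print("P(run below baseline):    ", prob_below(updated, base))
    print("worst-case run, updated:  ", worst_case(updated))
```

In this made-up example the expected true return goes up, yet one run in ten ends far below the baseline, which is the gap between a guarantee in expectation and a guarantee per run that the question is about.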