Potential paper from DM/Stanford for a future newsletter: https://arxiv.org/pdf/1911.00459.pdf
It addresses the problem that an RL agent will delude itself by finding loopholes in a learned reward function.
Thanks!