I feel like there’s a fairly common argument that RL isn’t all that dangerous because it generalizes from the training distribution cautiously: going outside the training distribution won’t suddenly cause an RL system to execute multi-step plans that are implied but never seen in training; it will probably just fall back on familiar, safe behavior.
To me, arguments like this feel like they treat present-day model-free RL as the “central case” and model-based RL as a small correction.
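To make the distinction concrete, here’s a minimal toy sketch. Everything in it is my own illustration, not anything from the post: the chain world, the tabular Q-values, and the breadth-first planner are made-up stand-ins. The point is just that a model-free policy has nothing to say about a state it never visited, while a model-based agent can chain one-step predictions it learned piecemeal into a multi-step plan it never executed end-to-end in training.

```python
# Toy world: states 0..6, actions -1/+1. Training data only ever covers
# one-step transitions among states 0..3, never a full trajectory.
seen_transitions = {(s, a): s + a
                    for s in range(4) for a in (-1, 1)
                    if 0 <= s + a <= 6}

# --- Model-free: a Q-table over visited states, with a habitual fallback. ---
# (Placeholder values stand in for whatever training actually learned.)
q_table = {s: {a: 0.1 for a in (-1, 1)} for s in range(4)}

def model_free_act(state):
    if state in q_table:
        return max(q_table[state], key=q_table[state].get)
    return -1  # OOD state: no learned values, so fall back on a familiar action

# --- Model-based: learned one-step dynamics, chained into a novel plan. ---
dynamics = dict(seen_transitions)  # every individual piece was seen in training

def model_based_plan(state, goal, horizon=10):
    """Breadth-first search through the learned model: composes one-step
    predictions into a multi-step plan never executed as a whole in training."""
    frontier = [(state, [])]
    for _ in range(horizon):
        next_frontier = []
        for s, plan in frontier:
            for a in (-1, 1):
                s2 = dynamics.get((s, a))
                if s2 is None:
                    continue
                if s2 == goal:
                    return plan + [a]
                next_frontier.append((s2, plan + [a]))
        frontier = next_frontier
    return None

print(model_free_act(5))       # -1: unfamiliar state, default to habit
print(model_based_plan(0, 3))  # [1, 1, 1]: a composed plan, never seen whole
```

On this picture, “falls back on familiar behavior out of distribution” is a fact about the model-free agent, not about RL as such; the model-based agent happily outputs novel multi-step plans wherever its world model reaches.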
Anyhow, good post; I like most of the arguments. I just felt my reaction to this particular one could be made in meme format.