But if there are safety problems in approval, wouldn’t there also be safety problems in the human’s behavior, which imitation learning would copy?
The human’s behavior could be safer because a human mind doesn’t optimize hard enough to push outside the range of inputs where approval is safe, or because it has a “proposal generator” that only produces actions that, with high probability, stay within that range.
Similarly, if there are safety problems in the estimation process, wouldn’t there also be safety problems in the prediction of what action a human would take?
Same here: if you just predict what action a human would take, you’re applying less optimization pressure, so you’re less likely to end up outside the range where the estimation process is safe.
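To make the point concrete, here is a toy sketch (everything in it — the 1-D action space, the approval functions, and the numbers — is a made-up assumption for illustration, not something from this exchange): an agent that searches hard for the action its learned approval model scores highest gets pushed into the region where the model is unreliable, while an agent that just imitates a human-like proposal distribution stays inside the range where the model was trained and safe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: actions are real numbers. "True" approval is highest
# near a = 0, and the approval model was only trained on actions in [-1, 1].
def true_approval(a):
    return 1.0 - a**2

# Learned approval model: accurate in-distribution, but with spurious
# high-scoring errors far outside the training range.
def learned_approval(a):
    if abs(a) <= 1.0:
        return true_approval(a) + 0.01 * rng.standard_normal()  # small in-range error
    return 5.0 * rng.random()  # unreliable (sometimes huge) out of range

# Approval-maximizing agent: searches a wide action space and picks whatever
# the *model* scores highest -- the optimization pressure drives it toward
# the model's out-of-distribution errors.
candidates = rng.uniform(-10, 10, size=10_000)
scores = np.array([learned_approval(a) for a in candidates])
optimized_action = candidates[scores.argmax()]

# Imitation-style agent: a "proposal generator" mimicking the human's action
# distribution, which (by assumption) stays inside the safe range.
human_like_proposals = np.clip(rng.normal(0.0, 0.3, size=10), -1, 1)
imitated_action = human_like_proposals[0]  # just act like the human would

print("optimizer picked:", optimized_action, "true approval:", true_approval(optimized_action))
print("imitator picked:", imitated_action, "true approval:", true_approval(imitated_action))
```

Running this, the argmax agent typically picks an action far outside [-1, 1] with badly negative true approval (it found a spuriously high score), while the imitator stays near 0, where both the human and the model are reliable.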
I tentatively think this applies to most imitation learning, not just the online variant of narrow imitation learning, but I’m pretty confused/unsure.
Ok, I’d be interested to hear more if you clarify your thoughts.