The goal of narrow reinforcement learning is to get something-like-human-level behavior using human-level oversight. Optimizing the human value function over short time horizons seems like a fine approach to me.
The difference from broad reinforcement learning is that you aren’t trying to evaluate actions you can’t understand by looking at the consequences you can observe.