Good post, I actually hold similar-ish views myself.
However, I’d be interested if you elaborated more on the last paragraph: what specific examples of that kind of research do you know about or recommend checking out?
There are also some links in https://old.reddit.com/r/reinforcementlearning/comments/9pwy2f/wbe_and_drl_a_middle_way_of_imitation_learning/
For example, I mentioned 1, 2, 3 earlier in the article… That should get you started but I’m happy to discuss more. :-)