paulfchristiano comments on My research priorities for AI control

paulfchristiano 6 Dec 2015 21:17 UTC
8 points
Some things can already be done by imitation based on our current understanding (and the range will grow as machine learning improves). The interesting part of the project is figuring out how to do the trickier things, which will require new ideas.

It’s not clear that imitation impairs your ability to generalize to new domains. An RL agent faces the question: in this new domain, how should I behave to receive rewards? It has not been trained in the domain, but must learn to reason about the domain and figure out what policies will work well. An imitation learner faces the question: in this new domain, how would the expert behave / what behavior would they approve of? The two questions seem similar in difficulty, and indeed you could use the same algorithmic ingredients.
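To make the "same algorithmic ingredients" point concrete, here is a minimal toy sketch, not anything from the research agenda itself. It assumes a three-action bandit with a softmax policy, and every name in it (`train`, `rl_signal`, `imitation_signal`, `BEST`) is hypothetical. Both learners share the same policy class and the same log-probability gradient step; they differ only in where the training signal comes from.

```python
import math
import random

BEST = 2  # hypothetical: the action that is rewarded / that the expert picks


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_prob(logits, action):
    # Gradient of log softmax(logits)[action] w.r.t. logits: onehot(action) - probs
    probs = softmax(logits)
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]


def train(signal, steps=2000, lr=0.1, n_actions=3, seed=0):
    # Shared ingredient: ascend weight * grad log pi(action).
    rng = random.Random(seed)
    logits = [0.0] * n_actions
    for _ in range(steps):
        action, weight = signal(logits, rng)
        g = grad_log_prob(logits, action)
        logits = [l + lr * weight * gi for l, gi in zip(logits, g)]
    return softmax(logits)


def rl_signal(logits, rng):
    # RL: sample an action from the current policy, weight by observed reward.
    probs = softmax(logits)
    action = rng.choices(range(len(probs)), weights=probs)[0]
    reward = 1.0 if action == BEST else 0.0
    return action, reward


def imitation_signal(logits, rng):
    # Imitation: the "expert" demonstrates an action; raise its likelihood.
    return BEST, 1.0


rl_policy = train(rl_signal)
imitation_policy = train(imitation_signal)
```

In this toy setting both policies end up concentrating on the same action. The sketch says nothing about the hard cases discussed here; it only illustrates that swapping the objective need not change the learning machinery.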

It’s also not clear why it matters whether a task involves thinking about cost functions, models, simulation, calculation, etc. These are techniques one could apply either to achieve a high reward or to produce actions the expert would approve of, like, or do themselves. You might say that at that point these rich internal behaviors must be guided by some non-trivial internal dynamic. But then we will just have the same discussion one level lower.