Sure. Though learning from verbal descriptions of hypothetical behavior doesn’t seem much harder than learning from actual behavior—they’re both about equally far from “utility function on states of the universe” :-)
I hope so! IRL and CIRL are really nice frameworks for learning from general behavior, and as far as I can tell, learning from verbal behavior requires a simultaneous model of verbal and general behavior, with some extra parts that I don’t understand yet.