Briefly skimming Christiano’s post, this is actually one of the few/first proposals from someone MIRI related that actually seems to be on the right track (and similar to my own loose plans). Basically it just boils down to learning human utility functions with layers of meta-learning, with generalized RL and IRL.
Briefly skimming Christiano’s post, this is actually one of the few/first proposals from someone MIRI related that actually seems to be on the right track (and similar to my own loose plans). Basically it just boils down to learning human utility functions with layers of meta-learning, with generalized RL and IRL.