Imitation learning. One-sentence summary: train models on human behaviour (such as logging which keys a human presses in response to what happens on a computer screen); contrast with Strouse.
Reward learning. One-sentence summary: groups like CHAI are still pursuing reward learning to “reorient the general thrust of AI research towards provably beneficial systems”. (They also do a lot of advocacy, like everyone else.)
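To make the imitation-learning summary concrete, here is a toy sketch of "train on human behaviour": a policy fitted purely from logged (screen state, key pressed) pairs, by majority vote. The states and keys are invented for illustration; real systems would use a learned function approximator rather than a lookup table.

```python
from collections import Counter, defaultdict

def fit_clone(demonstrations):
    """Toy behavioural cloning.

    demonstrations: iterable of (screen_state, key_pressed) pairs
    logged from a human operator.
    """
    by_state = defaultdict(Counter)
    for state, key in demonstrations:
        by_state[state][key] += 1
    # The "policy" simply replays the most common human choice per state.
    return {state: keys.most_common(1)[0][0] for state, keys in by_state.items()}

# Hypothetical keypress log: mostly 'a' when the enemy is left, 'd' when right.
demos = [("enemy_left", "a"), ("enemy_left", "a"), ("enemy_left", "d"),
         ("enemy_right", "d"), ("enemy_right", "d")]
policy = fit_clone(demos)
# policy["enemy_left"] → "a"; policy["enemy_right"] → "d"
```

Note the contrast with reward learning: here no reward function is ever inferred; the model copies behaviour directly.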
I question whether this captures the essence of proponents' hopes for either reward learning or imitation learning.
I think the two can be combined, since they share a core idea: learn the reward function from humans, and keep learning it.
For instance, some of these imitation learning papers aim to build an agent that is uncertain about human preferences and will consult a human when unsure.
The recursive reward modelling papers are similar: the AI learns a model of the reward function from human feedback and continually refines it.
This is a feature if you want an ASI to seek human guidance, even in unfamiliar scenarios.
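The consult-when-uncertain mechanism can be sketched in a few lines. This is a toy illustration, not any specific paper's method: uncertainty is proxied by disagreement across a small ensemble of reward estimates, and the agent defers to a human query whenever the ensemble disagrees or the situation is novel. All class names, thresholds, and learning rates are my own invention.

```python
import statistics

class UncertainRewardAgent:
    """Toy agent: keeps an ensemble of reward estimates and defers to a
    human whenever the ensemble disagrees too much or the action is novel."""

    def __init__(self, threshold=0.5):
        # Each ensemble member is a dict mapping action -> estimated reward.
        self.ensemble = [{}, {}, {}]
        self.threshold = threshold  # disagreement level that triggers a human query

    def observe(self, action, human_reward):
        # Human feedback nudges every ensemble member toward the signal,
        # each with a different learning rate so that members can disagree.
        for i, model in enumerate(self.ensemble):
            lr = 0.3 + 0.3 * i
            old = model.get(action, 0.0)
            model[action] = old + lr * (human_reward - old)

    def act(self, action, ask_human):
        estimates = [m.get(action, 0.0) for m in self.ensemble]
        novel = action not in self.ensemble[0]
        if novel or statistics.pstdev(estimates) > self.threshold:
            # Unfamiliar or contested: consult the human, then learn from the answer.
            reward = ask_human(action)
            self.observe(action, reward)
            return reward
        return statistics.mean(estimates)
```

Usage: the first time the agent sees an action it queries the human; once the ensemble has converged enough, it acts on its own estimate without asking again. The "continue to learn it" property lives in `observe`, which runs every time the human is consulted.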
At the meta level, this gives the system both instrumental and learned reasons to preserve human life. But it also gives it compelling reasons to modify us so that we don't hinder its quest for high reward: it may shape or filter us into compliant entities.