My sense is that Stuart's assumption of an initially specified reward function is a simplification rather than a key part of the plan, and that he'd also be interested in, e.g., generalizing a reward function learned from other sources of human feedback, like preference comparison.
IRD would do well on this problem because it has an explicit distribution over possible reward functions, but this isn’t really that unique to IRD—Bayesian IRL or preference comparison would have the same property.
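To make the "same property" concrete, here's a minimal sketch of what an explicit posterior over reward functions looks like when the feedback is preference comparisons rather than a proxy reward. This isn't code from IRD or any CHAI project; the candidate weight vectors, the feature-count representation of trajectories, and the `preference_posterior` helper are all illustrative, and the likelihood is a standard Bradley-Terry model over a small discrete hypothesis set.

```python
# Sketch: preference comparisons, like IRD or Bayesian IRL, leave you with an
# explicit distribution over candidate reward functions.
import numpy as np

def preference_posterior(candidate_ws, prior, comparisons):
    """Posterior over candidate reward weights given (feats_a, feats_b, a_preferred) data."""
    log_post = np.log(prior)
    for feats_a, feats_b, a_preferred in comparisons:
        for i, w in enumerate(candidate_ws):
            r_a, r_b = w @ feats_a, w @ feats_b
            # Bradley-Terry likelihood of the observed preference under reward w
            p_a = 1.0 / (1.0 + np.exp(r_b - r_a))
            log_post[i] += np.log(p_a if a_preferred else 1.0 - p_a)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()

# Toy usage: two candidate reward functions, one comparison favoring the first.
candidates = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
prior = np.array([0.5, 0.5])
data = [(np.array([3.0, 1.0]), np.array([1.0, 3.0]), True)]
print(preference_posterior(candidates, prior, data))  # mass shifts toward the first candidate
```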
(I don’t think we have experience with deep Bayesian versions of IRL / preference comparison at CHAI, and I was hoping for advice on who to talk to.)
Yeah, I agree with that.