I’ve been writing about multi-objective RL, trying to work out how an RL agent could optimize a non-linear combination of objectives in a way that avoids strongly negative outcomes on any single objective.
https://www.lesswrong.com/posts/i5dLfi6m6FCexReK9/a-brief-review-of-the-reasons-multi-objective-rl-could-be
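
As a rough illustration of what a non-linear combination could look like, here is a minimal sketch that passes each objective's reward through a concave function before summing. The transform `-exp(-r)`, the function names, and the example reward vectors are all assumptions chosen for illustration; they are not taken from the linked post.

```python
import math

def linear_sum(rewards):
    """Plain sum: a large loss on one objective can be offset by gains elsewhere."""
    return sum(rewards)

def concave_sum(rewards):
    """Sum of -exp(-r) per objective (an assumed, illustrative transform).

    -exp(-r) is increasing and concave, so a loss on one objective
    costs far more than an equal-sized gain on another can recover.
    """
    return sum(-math.exp(-r) for r in rewards)

# Two hypothetical outcomes with the same linear total.
balanced = [1.0, 1.0, 1.0]    # moderate on every objective
lopsided = [5.0, 5.0, -7.0]   # strong on two, very bad on one

print(linear_sum(balanced), linear_sum(lopsided))
print(concave_sum(balanced), concave_sum(lopsided))
```

Under the plain sum the two outcomes tie, while under the concave aggregation the balanced outcome scores much higher, because the single strongly negative objective dominates the total.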
Thank you! This is addressing the question I was trying to get at. I’ll check it out.