Thanks for the link. It turns out I missed some of the articles in the sequence. Sorry for misunderstanding your ideas.
I thought about it, and I don’t think your agent would have the issue I described.
Now, if the reward function were learned using something like a universal prior, then other agents might be able to hijack the learned reward function and make the AI misbehave. But that concern is already well known.