Maybe a nitpick, but ideally the reinforcement shouldn’t just be based on “behavior”; you want to reward the agent when it does the right thing for the right reasons. Right? (Or maybe you’re defining “cooperative behavior” as not only external behavior but also underlying motivations?)