lynettebye comments on The shard theory of human values

lynettebye Nov 7, 2022, 11:43 PM
3 points
I wasn’t thinking of shards as reward prediction errors, but I can see how the language was confusing. What I meant is that when multiple shards are activated, they affect behavior according to how strongly and reliably they were reinforced in the past. Practically, this looks like competing predictions of reward (because past experience is strongly correlated with predictions of future experience), although technically it’s not a prediction: the shard is just based on past experience and will influence behavior similarly even if you rationally know the context has changed. E.g. the cake shard will probably still push you toward eating cake even if you know that you just had mouth-changing surgery that means you don’t like cake anymore.
(However, I would expect that shards evolve over time. So in this example, after enough repetitions in which eating cake reliably fails to be reinforced, the cake shard would eventually stop making you crave cake when you see cake.)
So in my example, cleaner language might be: I more reliably ate cake in the past when someone was currently offering me a slice of cake than when someone was promising to bring a slightly better cake to the office party tomorrow. So when the “someone is currently offering me something” shard and the “someone is promising me something” shard are both activated, the first shard affects my decisions more, because it was rewarded more reliably in the past.
(One test of this theory might be whether people are more likely to take the bigger, later payout if they grew up in extremely reliable environments where they could always count on the adults to follow through on promises. In that case, their “someone is promising me something” shard should have been reinforced similarly to the “someone is currently offering me something” shard. This is basically one explanation given for the classic Marshmallow Experiment—kids waited if they trusted adults to follow through with the promised two marshmallows; kids ate the marshmallow immediately if they didn’t trust adults.)
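To make the weighting idea concrete, here is a toy Python sketch. It is only an illustration under assumptions I am making up for the example, not a claim about how shards are actually implemented: each shard gets a single "strength" number, that number is nudged toward 1 when the behavior was reinforced and toward 0 when it was not, and influence is split in proportion to strength. The function names, the learning rate, and the reinforcement histories are all hypothetical.

```python
def update_strength(strength, reinforced, learning_rate=0.1):
    """Nudge a shard's strength toward 1 if reinforced, toward 0 if not (assumed rule)."""
    target = 1.0 if reinforced else 0.0
    return strength + learning_rate * (target - strength)


def train(history, strength=0.5, learning_rate=0.1):
    """Apply the update once per past experience; history is a list of booleans."""
    for reinforced in history:
        strength = update_strength(strength, reinforced, learning_rate)
    return strength


def influence(strengths):
    """Split influence between active shards in proportion to their strengths."""
    total = sum(strengths.values())
    if total == 0:
        return {name: 0.0 for name in strengths}
    return {name: s / total for name, s in strengths.items()}


# Hypothetical histories: offers were always followed by cake,
# promises were followed by cake only half the time.
offer = train([True] * 20)
promise = train([True, False] * 10)

# When both shards are active, the more reliably reinforced one dominates.
shares = influence({"offer": offer, "promise": promise})

# Shards evolve: after many unreinforced repetitions the strength decays,
# regardless of what you "rationally know" about the changed context.
offer_after_surgery = train([False] * 50, strength=offer)
```

In this sketch, someone who grew up with promises that were always kept would have a `promise` history of all `True`, which gives the two shards equal strength and so equal influence.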