Likewise for values and reward: if something physiologically changes my rewards on a long timescale, then I will consistently see different values earlier versus later on that timescale, and it makes sense to interpret that as my values changing over time. Aging and pregnancy are classic examples: our bodies give us different reward signals as we grow older, and different reward signals when we have children. Those metaphorical screens show us different values, so it makes sense to treat that as a change in values rather than a change in our beliefs about values.
I feel like this ends up equating value with reward, which is wrong. Consider e.g. Steve Byrnes’ point about salt-starved rats: at first they are negatively rewarded by salt, but once salt-deprived they are positively rewarded by it. Yet rather than modelling them as switching from anti-valuing salt to valuing salt, I find it more insightful to model them as always valuing homeostasis.
The resolution implicit in the post is that there’s a “value change” when the reward signals before and after are not better compressed by viewing them as generated by a single set of “values”. So in the salt case, insofar as the reward stream is best compressed by viewing the rat as always valuing homeostasis, that would be the “true values”. But insofar as the reward stream is no better compressed by one “values” than by two, there is a genuine values change.
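One way to make that compression criterion concrete, as a rough MDL-style sketch (the notation here is mine, not from the post): let $r_{1:T}$ be the observed reward signals and $L(\cdot)$ a description-length measure, and compare

$$\min_{v}\Big[L(v) + L(r_{1:T}\mid v)\Big] \quad\text{vs.}\quad \min_{v_1,\,v_2,\,t}\Big[L(v_1) + L(v_2) + L(r_{1:t}\mid v_1) + L(r_{t+1:T}\mid v_2)\Big].$$

If the single-values side is no larger, the one $v$ (e.g. “always values homeostasis” in the salt case) is the natural candidate for the true values; if splitting into $v_1, v_2$ at some time $t$ compresses strictly better, the data genuinely supports a values change at $t$.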