I like this frame, but I’d like a much better grasp on the “how do we distinguish changes in beliefs vs values” and “how do we distinguish reliable from unreliable data.”
The more central problem is: at least some of the time, it’s possible for someone to say things to me that nudge me in some values-seeming direction, in ways that, when I look back, I’m not sure whether or not I endorse.
Some case studies:
I am born into a situation where, as a very young child, I happen to have peers with a puritan work ethic. I end up orienting myself such that I get feel-good reward signals for doing dutiful work. (Vs., in an alternate world, I have some STEM-y parents who encourage me to solve puzzles and get an ‘aha! insight!’ signal that trains me to value intellectual challenge, and who maybe meanwhile actively discourage me from puritan-style work in favor of unschool-y self-exploration.)
I happen to move into a community where people have different political opinions, and mysteriously I find myself adopting those political opinions in a few years.
Someone makes subtly-wrong arguments at me (maybe intentionally, maybe not), and those lead me to start rehearsing statements about my goals or beliefs, which in turn get me reward signals that are based on falsehood. (This one at least seems sort of obvious – you handle it with ordinary epistemics and be like “well, your epistemics weren’t good enough, but that’s more of a problem with epistemics than a flaw in this model of values.”)