I think you’re primarily addressing reward signals or reinforcement signals. These are, by definition, signals that make behavior preceding them more likely in the future. In the mammalian brain, they define what we pursue.
Other emotions are different; back to them later.
The dopamine system appears to play this role in the mammalian brain. It’s somewhat complex, in that new predictions of future reward seem to be the primary source of reinforcement for humans. For instance, if someone hands me a hundred dollars, I form a new prediction that I’ll eat food, get shelter, or do something else that in turn predicts reward; so I’ll repeat whatever behavior preceded that, and I’ll update my predictions of future reward.
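To make that concrete, here’s a rough Python sketch of the idea, assuming a simple temporal-difference-style update; the states, numbers, and names are mine, purely illustrative, not a claim about the actual dopamine circuitry:

```python
# A minimal, illustrative sketch (not a model of real dopamine circuitry):
# the reinforcing signal is the *change* in predicted future reward, i.e. a
# prediction error, rather than the raw reward alone. States, numbers, and
# the update rule are all invented for illustration.

learning_rate = 0.1
discount = 0.9

# Predicted future reward ("value") for a few situations I might be in.
value = {"no_money": 0.0, "got_100_dollars": 0.0, "ate_food": 0.0}

def td_update(state, next_state, raw_reward):
    """Update the value of `state`; the returned prediction error acts as
    the reinforcement signal for whatever behavior led to `next_state`."""
    prediction_error = raw_reward + discount * value[next_state] - value[state]
    value[state] += learning_rate * prediction_error
    return prediction_error

# Eating delivers a primary reward, so its value gets learned first...
td_update("ate_food", "no_money", raw_reward=1.0)

# ...and then merely being handed $100 (which predicts eating) produces a
# positive prediction error, even though no primary reward has arrived yet.
surprise = td_update("got_100_dollars", "ate_food", raw_reward=0.0)
print(surprise)  # > 0: whatever behavior got me the $100 is reinforced
```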
For way more than you want to know about how dopamine seems to shape our actions, see my paper Neural mechanisms of human decision-making and the masses of work it references.
Or better yet, read Steve Byrnes’ Intro to brain-like-AGI safety sequence, focusing on the steering subsystem. Then look at his Valence sequence for more on how we pass reward predictions among our “thoughts” (representations of concepts). (IMO, his notion of valence matches exactly what the dopamine system is known to do for short-timescale tasks, and what it probably does in complex human thought.)
So, when you ask people what their goals are, they’re mentioning things that predict reward to them. They’re guesses about what would give a lot of reward signals. The correct answer to “why do you want that?” is “because I think I’d find it really rewarding”. (“I’d really enjoy it” is close but not quite correct, since there’s a difference between wanting and liking in the brain; google that for another headful.)
Now, we can be really wrong about what we’d find rewarding or enjoy. I think we’re usually way off. But that is how we pick goals, and it’s what drives our behavior (along with a bunch of other factors that are less determinative, like what we know about and what happens to come into our attention).
Other emotions, like fear, anger, etc., are different. They can be thought of as “tilts” to our cognitive landscape. Even learning that we’re experiencing them is tricky. That’s why emotional awareness is a subject to learn about, not just something we’re born knowing. We need to learn to “feel the tilt”. An elevated heart rate might signal fear, anger, or excitement; noticing it, or finding other cues, is necessary to understand how we’re tilted, and how to correct for it if we want to act rationally. Those sorts of emotions “tilt the landscape” of our cognition by making different thoughts and actions more likely, like thoughts of how someone’s actions were unfair, or physical attacks, when we’re angry.
See also my post [Human preferences as RL critic values—implications for alignment](https://www.lesswrong.com/posts/HEonwwQLhMB9fqABh/human-preferences-as-rl-critic-values-implications-for). I’m not sure how clear or compelling it is. But I’m pretty sure that predicted reward is roughly synonymous with what we call “values”.
Wow, thank you so much. This is a lens I totally hadn’t considered.
You can see in the post how I was confused about how evolution played a part in “imbuing” material terminal goals into humans. I was like, “but kinetic sculptures were not in the ancestral environment?”
It sounds like rather than imbuing humans with material goals, evolution has imbued us with a process by which we create our own.
I would still define material goals as simply terminal goals which are not defined by some qualia, but it is fascinating that this is what material goals look like in humans.
This also, as you say, makes it harder to distinguish between emotional and material goals in humans, since our material goals are ultimately emotionally derived. In particular, it makes it difficult to distinguish between an instrumental goal serving an emotional terminal goal, and a learned material goal created from the reinforced prediction of its expected emotional reward.
E.g. the difference between someone wanting a cookie because it will make them feel good, and someone wanting money as a terminal goal because their brain frequently predicted that money would lead to feeling good.
I still make this distinction between material and emotional goals because this isn’t the only way that material goals play out among all agents. For example, my thermostat has simply been directly imbued with the goal of maintaining a temperature. I can also imagine this is how material goals play out in most insects.
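To illustrate the contrast, here’s a toy sketch (class and parameter names invented): the thermostat’s goal is fixed when it’s built, and nothing about it is learned from, or derived from, a reward signal.

```python
# A toy "directly imbued" material goal, thermostat-style. The target is
# hard-wired at construction; no reward signal or learning is involved.
# Names and thresholds are made up for illustration.

class Thermostat:
    def __init__(self, target_temp: float):
        self.target_temp = target_temp  # the goal, imbued directly

    def act(self, current_temp: float) -> str:
        """Pursue the goal without ever having learned it."""
        if current_temp < self.target_temp - 0.5:
            return "heat"
        if current_temp > self.target_temp + 0.5:
            return "cool"
        return "idle"

print(Thermostat(21.0).act(18.2))  # "heat"
```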
> Other emotions, like fear, anger, etc., are different. They can be thought of as “tilts” to our cognitive landscape.

This makes a lot of sense. Yeah, I was definitely simplifying all emotions to just their qualia effect, without considering the other physiological effects that define them. So I guess in this post, when I say “emotion”, I really mean “qualia”.
> But I’m pretty sure that predicted reward is roughly synonymous with what we call “values”.

Just to clarify, are you using “reward” here to also mean “positive (or a lack of negative) qualia”? Or is this reinforcement mechanism recursive, such that we might learn to value something because of its predicted reward, but that reward is itself a learned value… and so on, where the base case is an emotional reward? If so, how deep can it go?
I’m so glad you found that response helpful!
I primarily mean reward in the sense of reinforcement—a functional definition from animal psychology and neuroscience: reinforcement is whatever makes the previous behavior more likely in the future.
But I also mean a positive feeling (qualia, if you like, although I find that term too contentious to use much). I think we have a positive feeling when we’re getting a reward (reinforcement), but I’m not sure that all positive feelings work as reinforcement. Maybe.
As to how deep that recursive learning mechanism can go: very deep. When people spend time arguing about logic and abstract values online, they’ve gone deep. There’s no limit, until the world intervenes to tell you your chain of predicted-reward inferences has gone off-track. For instance, if that person has lost their job, and they’re cold and hungry, they might trace back through the (correct) logic, conclude that they ascribed too much value to proving people wrong on the internet, and reduce their estimate of its value.
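To show roughly what I mean by “deep”, here’s a crude sketch (concept names and numbers are invented): value flows backward along a chain of predictions, and one collision with reality can knock a link’s value back down.

```python
# A crude sketch of value propagating down a long chain of "thoughts", and
# of revision when the world pushes back. Concept names and numbers are
# invented purely to illustrate the recursive structure.

learning_rate = 0.5
discount = 0.9

# Each concept predicts the next one; the last link is a primary reward.
chain = ["winning_arguments_online", "being_respected", "having_allies",
         "having_resources", "eating_and_shelter"]

value = {"eating_and_shelter": 1.0}  # base case: primary (emotional) reward
for concept, predicts in reversed(list(zip(chain, chain[1:]))):
    value[concept] = discount * value[predicts]  # value flows up the chain

print(round(value["winning_arguments_online"], 3))  # ~0.656: deep, but nonzero

# The world intervenes: arguing online did not, in fact, keep me fed.
observed_payoff = 0.0
value["winning_arguments_online"] += learning_rate * (
    observed_payoff - value["winning_arguments_online"])
print(round(value["winning_arguments_online"], 3))  # revised sharply downward
```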