[note: anti-realist non-Utilitarian here; I don’t believe “utility” is actually a universal, measurable thing, nor that it’s comparable across entities (nor across time for any real entity). Consider this my attempt at an ITT (Ideological Turing Test) on this topic for Utilitarianism.]
One possible answer is that it’s true that those emotions are pretty core to most people’s conception of utility (at least most people I’ve discussed it with). But this does NOT mean that the emotions ARE the utility; they’re just an evolved mechanism that points at utility, and not necessarily the only possible such mechanism. Goodhart’s Law hits pretty hard if you treat the emotions directly as the utility.
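To see that failure mode concretely, here’s a toy sketch (every name and number below is made up for illustration, not a claim about any formal theory): an optimizer pointed at the emotion-proxy instead of the underlying goal-satisfaction ends up in a region where the proxy is gamed and the actual goal-satisfaction is terrible.

```python
# Toy Goodhart sketch; all names and numbers here are illustrative.
import random

def true_utility(action: float) -> float:
    # the actual goal-satisfaction; the best possible action is 1.0
    return -(action - 1.0) ** 2

def emotion_proxy(action: float) -> float:
    # evolved signal: tracks true utility for ordinary actions, but a
    # degenerate "wireheading" region (action > 5) games the signal directly
    bonus = 30.0 if action > 5.0 else 0.0
    return true_utility(action) + bonus

random.seed(0)
candidates = [random.uniform(-10.0, 10.0) for _ in range(10_000)]
best = max(candidates, key=emotion_proxy)  # optimize the proxy, not the target
print(f"proxy-optimal action: {best:.2f}, true utility there: {true_utility(best):.1f}")
# the proxy-optimizer lands just inside the gamed region (~5.0),
# where true utility is about -16 instead of the achievable 0
```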
Utility itself is an abstraction over how well an entity’s goals/preferences about the state of the universe are satisfied. Or, in some conceptions, the eu-satisfaction of the goals the entity would have if it were fully informed.
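If it helps, here’s a minimal toy formalization of that abstraction (purely illustrative; nothing below is a claim about how utility “really” works):

```python
# Minimal sketch of "utility as an abstraction over preference satisfaction".
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]  # some description of the state of the universe

@dataclass
class Preference:
    weight: float                        # how much the entity cares
    satisfied: Callable[[State], float]  # degree of satisfaction in [0, 1]

def utility(prefs: List[Preference], state: State) -> float:
    """Utility = weighted sum of how well each preference is satisfied."""
    return sum(p.weight * p.satisfied(state) for p in prefs)

# Example: an entity that prefers warmth and quiet.
prefs = [
    Preference(weight=2.0, satisfied=lambda s: min(1.0, s["temp"] / 21.0)),
    Preference(weight=1.0, satisfied=lambda s: 1.0 - min(1.0, s["noise"])),
]
print(utility(prefs, {"temp": 20.0, "noise": 0.3}))  # -> ~2.6
```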
>Utility itself is an abstraction over how well an entity’s goals/preferences about the state of the universe are satisfied.
You can say that a robot toy has a goal of following a light source. Or that a thermostat has a goal of keeping the room temperature at a certain setting. But I have yet to hear of anyone counting those things toward total utility calculations.
Of course a counterargument would be “but those are not actual goals, those are the goals of the humans that set it”, but in that case you’ve just hidden all the references to humans inside the word “goal” and are back to square one.
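To make that concrete, here’s roughly what a thermostat’s “goal” cashes out to (hypothetical code, not any real device’s firmware). The only goal-shaped thing in it is a constant a human chose; call that “the thermostat’s goal” and you’ve smuggled the human back in:

```python
# Hypothetical thermostat, for illustration only.
SETPOINT = 21.0  # chosen by a human; the only "goal" here lives in this constant

def thermostat_step(room_temp: float) -> str:
    # the mechanism just pushes room_temp toward SETPOINT;
    # nothing in it prefers 21.0 over any other number
    if room_temp < SETPOINT - 0.5:
        return "heat on"
    if room_temp > SETPOINT + 0.5:
        return "heat off"
    return "idle"

print(thermostat_step(19.0))  # -> heat on
```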