The utility function is an abstraction that does not capture the richness of the behavior of agents in the real world, but IMHO the better an agent is at rationality, the more accurately (and comprehensively) the utility function describes the agent. I suspect that it never makes sense to model a utility function as changing over time.
Or maybe it makes sense only as a mental shortcut, for use when we do not have time to make a proper analysis. People of course discover new wants and new preferences as they go through life, but this can be taken into account by saying (or noticing) that the person does not know their (unchanging) utility function in every detail, and every now and then they learn a previously unknown detail (or fact) about it.
A baby is not discovering who it is as its mind develops. It is becoming who it will be. This process does not stop before death. At no point can one say, “THIS is who I am” and stop there, imagining that all future change is merely discovering what one already was (despite the new thing being, well, new).
IMHO, utility functions only make sense for “small world” problems: local, well-defined, legible situations for which all possible actions and outcomes are known and complete preferences are possible. For “large worlds” the whole thing falls apart, for multiple reasons, all of which have been discussed often on LW (although not necessarily with the conclusion that I draw from them). Examples include the problems of defining collective utility, self-referential decision theories, non-ergodic decision spaces, game theory with agents reasoning about each other, the observed failures of almost everyone to predict the explosion of various technologies, and the impossibility of limiting the large world to anything less than the whole of one’s future light-cone.
I do not think that any of these will yield merely to “better rationality”.
It’s possible to view utility functions just like probability functions (“probability distributions”), namely as rational restrictions on a subjective state of mind at a particular point in time. Utilities can describe desires, just as probabilities can describe beliefs. That doesn’t cover multi-agent rationality or diachronic change, but in that respect it isn’t much different from probability theory. (Richard Jeffrey’s axiomatization of utility theory is expressed for such a “subjective Bayesian” purpose, but unfortunately it isn’t well known.)
Yeah, when I started studying neuroscience and the genetics of neurons I was kinda mind-blown by just how much change there is throughout the lifetime. There are certain things which are fairly static, like the long-range axons in your brain (i.e., those spanning more than a millimeter). Other things, like the phenotype (the set of expressed genes) and the synapses, change from second to second.
Indeed, it caused a bit of a fuss in the neuroscience community when enough evidence was gathered that we had to finally admit that the synapses/dendritic spines in the brain fluctuate too fast and chaotically to be the storage site of learned information that they were long thought to be. Other things may be, such as proteins that remain in place in the cell while the dendritic spine grows and collapses, or certain patterns of gene expression (triggered by reinforced synaptic activity during learning) which code for a propensity to form a synapse in a particular location… we just don’t know at this point.
I used to think similarly, but my friend Max Harms convinced me otherwise. He explained that what he, and others he was in agreement with, meant by ‘utility function’ was not the simple thing written down in the rules of a model doing RL. He meant a much grander thing: the description of a person’s life and behavior as seen from outside time itself by an omniscient observer capable of perfectly simulating the observed agent in infinitely many contexts. A fundamental timeless truth about the state of the universe, not a mere human-knowable description.
From this point of view, any utility function that can be written down about some complex real-world agent is likely just a crude approximation of the true utility function, which is potentially too complex to be written in full within the bounds of our observable universe.
I feel like we need two different terms for these two different concepts.
I think you are responding to my “is an abstraction that does not capture the richness” which on reflection I’m not attached to and would not include if I were to rewrite my comment.
Your “seen from outside time” suggests that maybe you agree with my “it never makes sense to model a utility function as changing over time”. In contrast, some on LW hold that a utility function needs to change over time (for human potential to remain “open” or some such). I think that that doesn’t work; i.e., if it changes over time, I think that it is incompatible with the four axioms of VNM-rationality, so these people should switch to some other term than “utility function” for the same reason that someone using the term “point”, “line” or “plane” in a way that is inconsistent with the axioms of geometry should find some other term. (I have no doubt that your “much grander thing” is compatible with the four axioms.)
(In the context of RL, I’m used to hearing it referred to as a reward function.)
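For reference, here is how the four VNM axioms are usually stated. The notation is an assumption added for illustration, not the commenter’s: L, M, N are lotteries (probability distributions over outcomes) and ⪯ is the agent’s weak preference over them.

```latex
% Preferences \preceq over lotteries L, M, N (probability distributions over outcomes).
\begin{align*}
&\text{Completeness:} && L \preceq M \ \text{ or } \ M \preceq L \\
&\text{Transitivity:} && (L \preceq M \ \wedge\ M \preceq N) \implies L \preceq N \\
&\text{Continuity:}   && L \preceq M \preceq N \implies \exists\, p \in [0,1]:\ pL + (1-p)N \sim M \\
&\text{Independence:} && L \preceq M \iff pL + (1-p)N \preceq pM + (1-p)N
                         \quad \text{for all } N,\ p \in (0,1]
\end{align*}
% Representation theorem: the axioms hold iff there is a function u with
\[
L \preceq M \iff \mathbb{E}_L[u] \le \mathbb{E}_M[u],
\]
% where u is unique up to a positive affine transformation u \mapsto a u + b, a > 0.
```

Nothing in this statement carries a time index; whether that settles the question of changing utility functions is what this thread is about.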
Yeah, I’d rather say that a utility function can be a description of an agent at a particular point in time, or across the agent’s entire existence, depending on how you frame it.
Like, for an instance in time (i) where you are evaluating what an agent will do next, there is some mathematical description of what they will do next based on their state of existence and the context that they are in.
If you have several moments in time, you could define such a description for each moment. Indeed, as the agent may change over time, and the context almost certainly does, the utility function couldn’t be static (unless you were referring to the outside-of-time all-timepoints-included utility function).
Does that make sense?
I’m not stating this with much confidence; this doesn’t feel like an idea I fully grok. I’m just trying to share what I think I’ve learned and to learn from you what you know, since it seems like you’ve thought this out more than I have.
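One way to make the two framings in this comment concrete is the toy sketch below. Everything in it is a hypothetical illustration (the `snapshots` table, the outcome names, and the helpers `timeless_utility` and `snapshot_at` are not from the thread). It only shows the bookkeeping: a family of per-moment utility functions carries the same information as one fixed function over (time, outcome) pairs. It does not show that either framing is the right model of a real agent.

```python
from typing import Callable, Dict

Outcome = str

# Hypothetical per-moment utilities: one table per point in time.
snapshots: Dict[int, Dict[Outcome, float]] = {
    0: {"candy": 1.0, "coffee": 0.0},  # the agent as a child
    1: {"candy": 0.2, "coffee": 0.9},  # the same agent years later
}


def timeless_utility(t: int, outcome: Outcome) -> float:
    """A single fixed function over (time, outcome) pairs."""
    return snapshots[t][outcome]


def snapshot_at(t: int) -> Callable[[Outcome], float]:
    """Recover the per-moment utility by fixing the time argument."""
    return lambda outcome: timeless_utility(t, outcome)


def preferred(t: int) -> Outcome:
    """Outcome with the highest utility at time t."""
    u = snapshot_at(t)
    return max(snapshots[t], key=u)


if __name__ == "__main__":
    for t in sorted(snapshots):
        print(t, preferred(t))
```

Viewed per moment, the preferred outcome differs between `t = 0` and `t = 1`; viewed as `timeless_utility`, nothing ever changes. Which description deserves the name “utility function” is the disagreement here.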
My assertion is that all utility functions (i.e., all functions that satisfy the 4 VNM axioms plus perhaps some additional postulates most of us would agree on) are static (do not change over time).
I should try to prove that, but I’ve been telling myself so for months now and haven’t mustered the energy, so I am posting the assertion now without proof, because a weak argument posted now is better than a perfect argument that might never be posted.
I’ve never been tempted to distinguish between “the outside-of-time all-timepoints-included utility function” and other utility functions such as the utility function referred to by the definition of expected utility: EU(action) = sum over all outcomes of U(outcome) × p(outcome | action).
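The expected-utility definition above can be sketched in a few lines of Python. The utilities, probabilities, and action names are made-up numbers for illustration only, and `expected_utility` and `best_action` are hypothetical helper names.

```python
from typing import Mapping


def expected_utility(
    utility: Mapping[str, float],
    p_outcome_given_action: Mapping[str, float],
) -> float:
    """EU(action) = sum over outcomes of U(outcome) * p(outcome | action)."""
    total_p = sum(p_outcome_given_action.values())
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("p(outcome | action) must sum to 1")
    return sum(utility[o] * p for o, p in p_outcome_given_action.items())


def best_action(
    utility: Mapping[str, float],
    model: Mapping[str, Mapping[str, float]],
) -> str:
    """Return the action whose expected utility is highest."""
    return max(model, key=lambda a: expected_utility(utility, model[a]))


# Hypothetical utilities over outcomes (U) and outcome probabilities per action (P).
U = {"dry": 1.0, "wet": 0.0, "dry_but_encumbered": 0.8}
P = {
    "take_umbrella": {"dry_but_encumbered": 1.0},
    "leave_umbrella": {"dry": 0.7, "wet": 0.3},
}

if __name__ == "__main__":
    for action in P:
        print(action, expected_utility(U, P[action]))
    print("best:", best_action(U, P))
```

Note that `U` is the only place where preferences enter; the probabilities describe beliefs about the world, not desires.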
Ok, the static nature of a utility function for a static agent makes sense.
But in the case of humans, or of ML models with online (ongoing) learning, we aren’t static agents.
The continuity of self is an illusion. Every fraction of a second we become a fundamentally different agent. Usually the difference is imperceptibly slight. The change isn’t a random walk, however; it’s based on interactions with the environment and inbuilt algorithms, plus randomness and (in the case of humans) degradation from aging.
Over the span of seconds, this likely has no meaningful impact on the utility function. Over a longer span, like a year, this has a huge impact. Fundamental values can shift. The different agents at those different timepoints surely have different utility functions, don’t they?
IMHO, no.