Consider a very simple model where the world has just two variables, represented by real numbers: cultural values (c) and some other variable (x). Our utility function is U(c, x) = c*x, which is clearly constant over time. However, our preferred value of x will strongly depend on cultural values: if c is negative, we want to minimize x, while if c is positive, we want to maximize x.
This model is so simple that it behaves quite strangely (e.g. it says you want to pick cultural values that view the current state of the world favorably), but it shows that by adding complexity to your utility function, you can make it depend on many things without the function itself actually changing over time.
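A minimal Python sketch of this toy model (the specific candidate values for x are purely illustrative): the utility function U(c, x) = c*x never changes, but which x you prefer flips with the sign of c.

```python
# Toy model: a single fixed utility function U(c, x) = c * x.
# The function itself never changes, but the preferred x flips with c.

def utility(c: float, x: float) -> float:
    return c * x

def preferred_x(c: float, candidates=(-1.0, 0.0, 1.0)) -> float:
    # Pick the candidate x that maximizes the (unchanging) utility at this c.
    return max(candidates, key=lambda x: utility(c, x))

print(preferred_x(c=2.0))   # c > 0: prefers the largest x  -> 1.0
print(preferred_x(c=-2.0))  # c < 0: prefers the smallest x -> -1.0
```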
Can you give me an example of this in reality? The math works, but I notice I am still confused, in that values should not just be a variable in the utility function… they should in fact change the utility function itself.
If they’re relegated to a variable, that seems to go against the original stated goal of wanting moral progress, in which case the utility function was originally constructed wrong.
values should not just be a variable in the utility function
All else being equal for me, I’d rather other people have their values get satisfied. So their values contribute to my utility function. If we model this as their utility contributing to my utility function, then we get mutual recursion, but we can also model this as each utility function having a direct and an indirect component, where the indirect components are aggregations of the direct components of other people’s utility functions, avoiding the recursion.
If they’re relegated to a variable, that seems to go against the original stated goal of wanting moral progress.
To be more specific, people can value society’s values coming more closely in line with their own values, or their own values coming more closely in line with what they would value if they thought about it more, or society’s values moving in the direction they would naturally without the intervention of an AI, etc. Situations in which someone wants their own values to change in a certain way can be modeled as an indirect component to the utility function, as above.
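A rough sketch of the direct/indirect decomposition described above, assuming a simple additive aggregation with a single altruism weight (the agent names, the weight, and the world state are purely illustrative):

```python
# Sketch of the direct/indirect decomposition: each agent has a "direct" utility over the
# world state, and the full utility adds an indirect term aggregating only the *direct*
# components of the other agents, so no mutual recursion arises.

from typing import Callable, Dict

WorldState = Dict[str, float]
DirectUtility = Callable[[WorldState], float]

def full_utility(agent: str,
                 direct: Dict[str, DirectUtility],
                 altruism_weight: float,
                 state: WorldState) -> float:
    own = direct[agent](state)                                             # direct component
    others = sum(u(state) for name, u in direct.items() if name != agent)  # indirect component
    return own + altruism_weight * others

# Hypothetical two-agent example: both care about a shared quantity "x".
direct = {
    "alice": lambda s: s["x"],    # Alice directly prefers larger x
    "bob":   lambda s: -s["x"],   # Bob directly prefers smaller x
}
print(full_utility("alice", direct, altruism_weight=0.5, state={"x": 3.0}))  # 3.0 + 0.5*(-3.0) = 1.5
```

Because the indirect term only ever sums direct components, each agent’s full utility is well-defined without solving for a fixed point.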
Define the “partial utility function” as how utility changes with x holding c constant (i.e. U(x) at a particular value of c). Changes in values change this partial utility function, but they never change the full utility function U(c,x). A real-world example: if you prefer to vote for the candidate that gets the most votes, then your vote will depend strongly on the other voters’ values, but this preference can still be represented by a single, unchanging utility function.
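The voting example can be sketched the same way, assuming utility 1 for having voted for the eventual winner and 0 otherwise (the candidates and vote tallies are made up): the preferred vote shifts with the other voters’ values, but the single utility function never changes.

```python
# One fixed utility function that rewards voting for the eventual winner.
# The preferred vote shifts with the other voters' values, but U itself never changes.

from collections import Counter

def U(my_vote: str, all_votes: list) -> float:
    winner, _ = Counter(all_votes).most_common(1)[0]
    return 1.0 if my_vote == winner else 0.0

def best_vote(other_votes: list, candidates=("A", "B")) -> str:
    # Choose the vote that maximizes the same fixed U, given what everyone else does.
    return max(candidates, key=lambda v: U(v, other_votes + [v]))

print(best_vote(["A", "A", "B"]))  # others lean toward A -> vote A
print(best_vote(["B", "B", "A"]))  # others lean toward B -> vote B
```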
I don’t understand your second paragraph: why would having values as a variable be bad? It’s certainly possible to change the utility function, but AlexMennen’s point was that future values could still be taken into account even with a static utility function. If the utility function is constant and also depends on current values, then it needs to take values into account as an argument (i.e. a variable).
How are you changing the values you optimize for without changing your utility function? This now seems even more handwavey to me.