I don’t understand. What do you mean by averaging two utility functions?
Can you show how the offsets cause trouble when you try to do normalization?
Can you show why we should investigate normalization at all? The question is always “What preference does this scheme correspond to, and why would I want that?”
Sure.
Suppose I have three hypotheses for the Correct Utility (whatever that means), over three choices (e.g. choice one is a dollar, choice two is an apple, choice three is a hamburger): (1, 2, 3), (0, 500, −1000), and (1010, 1005, 1000). Except of course, they all have some unknown offset ‘c’, and some unknown scale factor ‘k’.
Suppose I take the numbers at face value and just average them, weighted by some probabilities (the answer to your first question). If I think they’re all about equally plausible, the composite utility function is about (337, 502, 1). So I like the apple best, then the dollar, then the hamburger.
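A minimal sketch of that probability-weighted average, in case it helps to see the arithmetic. The names `hypotheses`, `weights` and `weighted_average` are made up for illustration; only the three vectors and the equal weights come from the example.

```python
# Choices, in order: (dollar, apple, hamburger).
# The three vectors are the hypotheses from the example; all names are illustrative.
hypotheses = [
    (1, 2, 3),
    (0, 500, -1000),
    (1010, 1005, 1000),
]
weights = [1 / 3, 1 / 3, 1 / 3]  # "about equally plausible"


def weighted_average(utilities, weights):
    """Average the utility vectors entry by entry, weighted by probability."""
    n_choices = len(utilities[0])
    return [
        sum(w * u[i] for w, u in zip(weights, utilities))
        for i in range(n_choices)
    ]


composite = weighted_average(hypotheses, weights)
print([round(x, 1) for x in composite])
```

Taken at face value, the apple comes out on top, then the dollar, then the hamburger, with function #2's big numbers doing most of the work.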
But what if these utility functions were written down by me while I was in 3 different moods, and I don’t want to just take the numbers at face value? What if I look back and think “man, I really liked using big numbers when I wrote #2, but that’s just an artifact, I didn’t really hate hamburgers 1000 times as much as I liked a dollar when I wrote #1. And I really liked everything when I wrote #3 - but I didn’t actually like a dollar 1000 times more than when I wrote #1, I just gave everything a bonus because I liked everything. Time to normalize!”
First, I try to normalize without taking offsets into account (now we’re starting the answer to your second question). I say “Let’s take function 1 as our scale, and just divide everything down until the biggest absolute value is 3.” Okay then, the functions become (1, 2, 3), (0, 1.5, −3), (3, 2.985, 2.97). If I then take the weighted average, the composite utility function is about (1.3, 2.2, 1.0). Now I still like the apple best, then the dollar, then the hamburger, but maybe this time I like the hamburger almost as much as the dollar, and so will make different (more sensible, perhaps) bets than before. There are a variety of possible normalizations (normalizing the average, normalizing the absolute value sum, etc.), but someone had a post exploring this (was it Stuart Armstrong? I can’t find the post, sorry) and didn’t really find a best choice.
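Here is a rough sketch of the scale-only normalization just described, assuming equal weights as before. `scale_to_max_abs` is a hypothetical helper name, and this is only one of the many possible normalizations.

```python
# Choices, in order: (dollar, apple, hamburger). Names are illustrative.
hypotheses = [
    (1, 2, 3),
    (0, 500, -1000),
    (1010, 1005, 1000),
]
weights = [1 / 3, 1 / 3, 1 / 3]


def scale_to_max_abs(u, target):
    """Multiply u by a positive constant so its largest |entry| equals target.

    No shifting happens here, so any offset baked into u is left in place.
    """
    biggest = max(abs(x) for x in u)
    return [x * target / biggest for x in u]


# Function 1 sets the scale: its biggest absolute value is 3.
target = max(abs(x) for x in hypotheses[0])
scaled = [scale_to_max_abs(u, target) for u in hypotheses]

composite = [
    sum(w * u[i] for w, u in zip(weights, scaled))
    for i in range(3)
]
print([[round(x, 3) for x in u] for u in scaled])
print([round(x, 1) for x in composite])
```

With these numbers the dollar (about 1.3) still edges out the hamburger (about 1.0), but the gap is far smaller than in the face-value average.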
However, there’s a drawback to just scaling everything down: it totally washed out utility function #3’s impact on the final answer. Imagine that I dismissed function #2 (probability = 0). Now whether I like the dollar more than the hamburger depends totally on whether or not I scale down function #3.
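A small check of that claim, as a sketch. It assumes the two remaining functions get equal weight once #2 is dismissed, which the example does not actually specify; the helper names are hypothetical.

```python
# Choices, in order: (dollar, apple, hamburger). Function #2 is dropped.
f1 = (1, 2, 3)
f3 = (1010, 1005, 1000)
DOLLAR, HAMBURGER = 0, 2


def scale_to_max_abs(u, target):
    """Multiply u by a positive constant so its largest |entry| equals target."""
    biggest = max(abs(x) for x in u)
    return [x * target / biggest for x in u]


def average(a, b):
    """Equal-weight average of two utility vectors (an assumption, see above)."""
    return [(x + y) / 2 for x, y in zip(a, b)]


raw = average(f1, f3)                          # function #3 left at face value
scaled = average(f1, scale_to_max_abs(f3, 3))  # function #3 scaled down to match #1

print("dollar beats hamburger, unscaled:", raw[DOLLAR] > raw[HAMBURGER])
print("dollar beats hamburger, scaled:  ", scaled[DOLLAR] > scaled[HAMBURGER])
```

The preference between the dollar and the hamburger flips depending only on whether function #3 was scaled, which is the washing-out effect in question.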
So I decide to shift everything, then scale it, so I don’t wash out the effect of function #3. I make the dollar the 0-point for all the functions, then rescale everything so the biggest absolute value is 2. Then the functions become (0, 1, 2), (0, 1, −2), (0, −1, −2). Then I average them to get (0, 1/3, −2/3). Yet again, I like the apple best, but again the ratios are different, so I’ll make different bets than before.
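The shift-then-scale step can be sketched the same way. `shift_then_scale` is a hypothetical helper; it assumes each function has at least two distinct values, since otherwise there is nothing to rescale.

```python
# Choices, in order: (dollar, apple, hamburger). Names are illustrative.
hypotheses = [
    (1, 2, 3),
    (0, 500, -1000),
    (1010, 1005, 1000),
]
weights = [1 / 3, 1 / 3, 1 / 3]
DOLLAR = 0


def shift_then_scale(u, zero_index, target):
    """Subtract u[zero_index] from every entry, then rescale so max |entry| is target.

    Assumes u is not constant; a constant u would give a zero divisor.
    """
    shifted = [x - u[zero_index] for x in u]
    biggest = max(abs(x) for x in shifted)
    return [x * target / biggest for x in shifted]


normalized = [shift_then_scale(u, DOLLAR, 2) for u in hypotheses]
composite = [
    sum(w * u[i] for w, u in zip(weights, normalized))
    for i in range(3)
]
print(normalized)
print([round(x, 3) for x in composite])
```

Function #3 now contributes (0, −1, −2) instead of three nearly identical numbers, so its opinion about the ordering actually shows up in the composite.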
Hm, I could have chosen better examples. But I’m too lazy to redo the math for better clarity—if you want something more dramatic, imagine function #3 had a larger apparent scale than #2, and so the composite choice shifted from looking like #3 to #2 as you normalized.
I am in total agreement with whatever point it seems like you just made, which seems to be that normalization schemes are madness.
What you “did” there is full of type errors and treating the scales and offsets as significant and whatnot. That is not allowed, and you seemed to be claiming that it is not allowed.
I guess it must be unclear what the point of OP was, though, because I was assuming that such things were not allowed as well.
What I did in the OP was completely decouple things from the arbitrary scale and offset that the utility functions come with by saying we have a utility function U’, and U’ agrees with moral theory m on object-level preferences conditional on moral theory m being correct. This gives us an unknown scale and offset for each utility function that masks out the arbitrariness of each utility function’s native scale and shift. Then that scale and shift are to be adjusted so that we get relative utilities at the end that are consistent with whatever preferences we want to have.
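Read one way, and this is only a sketch, that construction could be written as below. The symbols $U_m$, $k_m$, $c_m$ and $P(m)$ are notation assumed here for illustration, not something the OP is confirmed to use, and the second line additionally assumes $P(m)$ does not depend on the option chosen.

```latex
% Each theory m enters only through a positive rescaling k_m and a shift c_m.
U'(x) = \sum_{m} P(m)\,\bigl(k_m\,U_m(x) + c_m\bigr), \qquad k_m > 0

% Under the stated assumption, the shifts cancel in any difference of utilities:
U'(x) - U'(y) = \sum_{m} P(m)\,k_m\,\bigl(U_m(x) - U_m(y)\bigr)
```

On this reading, the unknowns left to adjust are the scales $k_m$, and they act on differences of utilities rather than on the raw numbers.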
I hope that clarifies things? But it probably doesn’t.
What you “did” there is full of type errors and treating the scales and offsets as significant and whatnot. That is not allowed, and you seemed to be claiming that it is not allowed.
Hm. You definitely did communicate that, but I guess maybe I’m pointing out a math mistake—it seems to me that you called the problem of arbitrary offsets solved too early. Though in your example it wasn’t a problem because you only had two outcomes and one outcome was always the zero point.
As I realized later because of Alex, the upshot is that to really deal with the problem of offsets you have to (at least de facto) normalize the relative utilities, not the utilities themselves. (On pain of stupidity)
Though in your example it wasn’t a problem because you only had two outcomes and one outcome was always the zero point.
I think my procedure does not run into trouble even with three options and other offsets. I don’t feel like trying it just now, but if you want to demonstrate how it goes wrong, please do.
the upshot is that to really deal with the problem of offsets you have to (at least de facto) normalize the relative utilities, not the utilities themselves. (On pain of stupidity)
I don’t understand what you are saying here.