> Probably the easy utility function makes agent 1 have more optimisation power. I agree this means comparisons between different utility functions can be unfair, but not sure why that rules out a measure which is invariant under positive affine transformations of a particular utility function?
Ah, ok. I assumed you wanted the invariance to make cross-utility comparisons. From your question I infer you want it instead for within-utility comparisons.
The reason I assumed the former is that I don’t consider invariance a useful criterion for within-utility comparisons: just as utility is a scale for comparing preferences, you want a scale on which to rate retargetable agents optimizing the same utility, and the particular units should not matter in any case.
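To make the “units should not matter” requirement explicit, here is one way to state it (my own notation, not anything from the thread; Opt is a hypothetical within-utility optimization score and A a fixed retargetable agent):

```latex
% A hedged formalization: rescaling or shifting the utility's units
% should leave the agent's rating unchanged.
\[
  \mathrm{Opt}(A,\; a\,u + b) \;=\; \mathrm{Opt}(A,\; u)
  \qquad \text{for all } a > 0,\; b \in \mathbb{R}.
\]
```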
In the case of utility, you can renormalize it in [0,1] if it’s bounded from above and below. An important example where this is not possible is the log scoring rule for predictions: an agent that outputs probabilities for not-yet-observed events, and whose utility is log p(actual outcome), where p is the probability distribution it reports, will output its actual probability assignments. (The log score is unbounded below, since log p(actual outcome) goes to −∞ as the reported probability goes to 0, so it cannot be renormalized into [0,1].) You can define “actual probability assignments” operatively as the probabilities the agent would use to choose the optimal action for any utility function, assuming the agent is retargetable. Does this suggest something about the feasibility of standardizing any optimization score? Maybe it’s not possible in general either? Maybe it’s possible under useful restrictions on the utility? For example, from the little I know about Infra-Bayesianism, they use utilities in [0,1].
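A minimal numerical sketch of the log-score point (my own illustration; the three-outcome event and the candidate reports are made up): the log scoring rule is proper, so an agent whose utility is log p(actual outcome) maximizes expected utility by reporting the probabilities it actually believes.

```python
# Sketch: honest reporting maximizes expected log score. Note also that the
# score is unbounded below (log p -> -inf as p -> 0 on an outcome that occurs),
# which is why this utility cannot be renormalized into [0,1].
import numpy as np

def expected_log_score(report, belief):
    # Expected utility E_{x ~ belief}[log report(x)] of announcing `report`
    # while actually believing `belief`.
    report = np.asarray(report, dtype=float)
    belief = np.asarray(belief, dtype=float)
    return float(np.sum(belief * np.log(report)))

belief = np.array([0.7, 0.2, 0.1])  # the agent's actual probability assignments
candidates = {
    "honest report": belief,
    "hedged report": np.array([0.5, 0.3, 0.2]),
    "overconfident": np.array([0.9, 0.05, 0.05]),
}
scores = {name: expected_log_score(p, belief) for name, p in candidates.items()}
assert max(scores, key=scores.get) == "honest report"
print(scores)  # the honest report gets the highest expected log score
```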
> Your last paragraph about retargetability sounds quite interesting. Do you have a reference for this story?
No, I’m making these thoughts up as I argue.
I am not sure whether, with “paragraph about retargetability”, you are just attaching a label to the paragraph or expressing specific interest in “retargetability”. I’ll assume the latter.
I used the term “retargetable agents” to mean “agents with a defined utility-swap operation”, because in general an agent may not be defined in a way that makes it clear what it means to “change its utility”. So, whenever I invoke comparisons of different utilities on the “same” agent, I want a way to mark this important requirement. I think “retargetable agent” is a good choice: I found it in TurnTrout’s sequence, and I don’t think I’m misusing it, even though I use it to mean something a bit different.
Even without cross-utility comparisons, when talking above about different agents with the same utility function, I preferred to say “retargetable agents”, because: what does it mean to say that an agent has a certain utility function, if the agent is not also a perfect Bayesian inductor? If I’m talking about measures of optimization, I probably want to compare “dumb” agents with “smart” agents, and not only in the sense of having “dumb” or “smart” priors. So when I contemplate the dumb agent failing to get more utility in a way that I can devise but it can’t, shouldn’t I conclude it is not a utility maximizer? If I want to say that an algorithm is maximizing utility, but stopping short of perfection, at some point I need to be specific about what kind of algorithm I’m talking about. It seems to me that a convenient, agnostic thing I could do is to consider a class of algorithms which have a “slot” for the utility function.
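To gesture at what I mean by a “slot” (a minimal sketch; the class, method names, and toy search budgets are all invented for illustration, not part of any existing framework): an algorithm parameterized by its utility function, where swapping the utility is a well-defined operation and “dumb” versus “smart” is just how hard it searches.

```python
# Sketch of "retargetable agents": algorithms with an explicit slot for the
# utility function, so comparing different agents on the same utility, or the
# same agent on different utilities, is well defined.
import random
from typing import Callable, Sequence

Utility = Callable[[str], float]  # maps an action/outcome to a real number

class RetargetableAgent:
    def __init__(self, utility: Utility, search_budget: int):
        self.utility = utility              # the "slot" for the utility function
        self.search_budget = search_budget  # crude knob for "dumb" vs "smart"

    def retarget(self, new_utility: Utility) -> None:
        # The utility-swap operation that makes the agent "retargetable".
        self.utility = new_utility

    def act(self, actions: Sequence[str]) -> str:
        # A bounded maximizer: it only evaluates `search_budget` candidates,
        # so it can fail to get utility in ways a smarter agent would not.
        considered = random.sample(list(actions), min(self.search_budget, len(actions)))
        return max(considered, key=self.utility)

# Two agents sharing the same utility, differing only in optimization power.
actions = [f"a{i}" for i in range(100)]
u = lambda a: -abs(int(a[1:]) - 42)       # arbitrary utility: prefer action a42
dumb = RetargetableAgent(u, search_budget=3)
smart = RetargetableAgent(u, search_budget=100)
print(u(dumb.act(actions)), u(smart.act(actions)))  # smart typically scores higher

dumb.retarget(lambda a: int(a[1:]))       # same algorithm, new utility
```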
Related: when Yudkowsky talks about utility maximizers, he doesn’t just say “the superintelligence is a utility maximizer”, he says “the superintelligence is efficient relative to you”, a fact from which you can make some inferences and not others, etc.