I’m not sure whether by “paragraph about retargetability” you are just attaching a label to the paragraph or expressing a specific concern about “retargetability”. I’ll assume the latter.
I used the term “retargetable agents” to mean “agents with a defined utility-swap operation”, because in general an agent may not be defined in a way that makes it clear what it means to “change its utility”. So, whenever I invoke comparisons of different utilities on the “same” agent, I want a way to mark this important requirement. I think “retargetable agent” is a good choice of term: I found it in TurnTrout’s sequence, and I think I’m not misusing it even though I use it to mean something a bit different.
Even without cross-utility comparisons, when talking above about different agents with the same utility function, I preferred to say “retargetable agents”, because: what does it mean to say that an agent has a certain utility function, if the agent is not also a perfect Bayesian inductor? If I’m talking about measures of optimization, I probably want to compare “dumb” agents with “smart” agents, and not only in the sense of having “dumb” or “smart” priors. So when I contemplate the dumb agent failing to get more utility in a way that I can devise but it can’t, shouldn’t I conclude that it is not a utility maximizer? If I want to say that an algorithm is maximizing utility, but stopping short of perfection, at some point I need to be specific about what kind of algorithm I’m talking about. It seems to me that a convenient, agnostic thing I could do is to consider a class of algorithms which have a “slot” for the utility function.
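As a toy illustration of the “slot” idea, here is a minimal Python sketch. Everything in it is an assumption made up for the example (the names `make_agent`, `budget`, `prefers_large`, `prefers_three`, and the choice of a budget-limited search as the algorithm class); it is not taken from TurnTrout’s sequence. The point is only that the utility function is an argument of the agent, so swapping it is a well-defined operation, and that two agents in the same class can share a utility while differing in how well they pursue it.

```python
from typing import Callable, List

Utility = Callable[[int], float]
Agent = Callable[[Utility, List[int]], int]


def make_agent(budget: int) -> Agent:
    """Build an agent that only examines the first `budget` actions.

    The utility function is a parameter (the "slot"), so the same
    agent can be retargeted by passing a different utility.
    """
    def agent(utility: Utility, actions: List[int]) -> int:
        considered = actions[:budget]  # bounded search: a "dumb" agent sees less
        return max(considered, key=utility)
    return agent


# Two agents from the same class, differing only in search budget.
dumb = make_agent(budget=2)
smart = make_agent(budget=10)

actions = [0, 1, 2, 3, 4, 5]


def prefers_large(a: int) -> float:
    return float(a)


def prefers_three(a: int) -> float:
    return -float((a - 3) ** 2)


# Same utility, different agents: both "have" prefers_large,
# but the dumb one stops short of the best action.
print(dumb(prefers_large, actions), smart(prefers_large, actions))

# Same agent, different utility: the swap is just a change of argument.
print(smart(prefers_large, actions), smart(prefers_three, actions))
```

In this sketch, saying that `dumb` “has utility `prefers_large`” only makes sense because the class fixes where the utility plugs in. Without that, its failure to pick the best action could equally be read as it maximizing some other utility.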
Related: when Yudkowsky talks about utility maximizers, he doesn’t just say “the superintelligence is a utility maximizer”, he says “the superintelligence is efficient relative to you, a fact from which you can make some inferences and not others, etc.”
No, I’m making thoughts up as I argue.