As always, it simply depends on your utility function.
Please don’t use “utility function” in this context. What you believe you want is different from what you actually want or what you should want or what you would like if it happened, or what you should want to happen irrespective of your own experience (and none of these are utility function in the technical sense), so conflating all these senses into a single rhetorical pseudomath buzzword is bad mental hygiene.
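For concreteness, the "technical sense" here is presumably the standard decision-theoretic (von Neumann–Morgenstern) one: a utility function is a real-valued function over outcomes whose expectation represents an agent's preference ordering over lotteries, roughly:

```latex
% Von Neumann--Morgenstern sense of "utility function":
% u assigns a real number to each outcome, and the agent prefers lottery B
% to lottery A exactly when B has higher expected utility.
u : X \to \mathbb{R}, \qquad
A \preceq B \;\iff\; \sum_{x \in X} P_A(x)\, u(x) \;\le\; \sum_{x \in X} P_B(x)\, u(x)
```

On that reading, u is a representation of the agent's preferences, not something the agent necessarily knows or can articulate.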
I’m completely baffled by your reply. I have no idea what the “technical sense” of the term “utility function” is, but I thought I was using it the normal, LW way: to refer to an agent’s terminal values.
What term should I use instead? I was under the impression that “utility function” was pretty safe, but apparently it carries some pretty heavy baggage. I’ll gladly switch to using whatever term would prevent this sort of reply in the future. Just let me know.
Or perhaps I simply repeated “utility function” way too many times in that response? I probably should have switched it up a lot more and alternated it with “terminal values”, “goal set”, etc. Using it like 6 times in such a short comment may have been careless and brought it undue attention and scrutiny.
Or… is there something you disagree with in my assessment? I understand that it’s controversial to state that people even have coherent utility functions, or even have terminal values, or whatever, so perhaps my comment takes something for granted that shouldn’t be?
Two more things:
Can you explain how exactly I conflated all those senses into that single word? I thought I used the term to refer to the same exact thing over and over, and I haven’t heard anything to convince me otherwise.
And what exactly does it mean for it to be a “rhetorical pseudomath buzzword”? That sounds like an eloquent attack, but I honestly can’t pinpoint how to interpret it at any higher level of detail than you simply reacting to my usage in a disapproving way.
Anyway, do you disagree that somebody could from one moment to the next have a terminal value (or whatever) for avoiding emotional pain at all costs? Or is that wrong or incoherent? Or what?
I’m completely baffled by your reply. I have no idea what the “technical sense” of the term “utility function” is, but I thought I was using it the normal, LW way: to refer to an agent’s terminal values.
Your usage was fine. Some people will try to go all ‘deep’ on you and challenge even the use of the term “terminal values” because “humans aren’t that simple etc”. But that is their baggage, not yours, and can be safely ignored.
Please don’t use “utility function” in this context.
I probably blatantly reveal my ignorance by asking this, but do only agents who know what they want have a utility-function? An AGI undergoing recursive self-improvement can’t possibly know what exactly it is going to “want” later on (some (sub)goals may turn out to be impossible while world states previously believed to be impossible might turn out to be possible), yet what it will “want” is implied by its given utility-function and the “nature of reality” (environmental circumstances).
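A toy sketch of that distinction (the names, actions, and numbers below are invented purely for illustration): the utility function over world states is fixed up front, but which plan (and hence which intermediate states the agent instrumentally "wants") only falls out of a search once the environment is known, so the agent needn't be able to state those wants in advance.

```python
from itertools import product

# Hypothetical toy model (all names invented for illustration): terminal utility
# is fixed in advance, but the plan the agent ends up pursuing (and hence the
# intermediate states it instrumentally "wants") is implied jointly by that
# utility function and the environment it finds itself in.

def utility(state):
    return state["paperclips"]  # terminal value: more paperclips is better


def step(state, action, env):
    """Environment dynamics: what each action accomplishes depends on env."""
    state = dict(state)
    if action == "mine" and env["ore_available"]:
        state["ore"] += 1
    elif action == "smelt" and state["ore"] > 0:
        state["ore"] -= 1
        state["paperclips"] += env["clips_per_ore"]
    elif action == "trade" and env["market_open"]:
        state["paperclips"] += 1
    return state  # "wait" or an impossible action changes nothing


def best_plan(env, horizon=3):
    """Brute-force search: the plan is derived, not stated in the utility function."""
    actions = ["mine", "smelt", "trade", "wait"]

    def value(plan):
        state = {"ore": 0, "paperclips": 0}
        for action in plan:
            state = step(state, action, env)
        return utility(state)

    return max(product(actions, repeat=horizon), key=value)


# Same utility function, different environments -> different instrumental "wants".
print(best_plan({"ore_available": True,  "clips_per_ore": 5, "market_open": False}))
print(best_plan({"ore_available": False, "clips_per_ore": 5, "market_open": True}))
```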
What you believe you want is different from what you actually want or what you should want or what you would like if it happened, or what you should want to happen irrespective of your own experience...
You believe that what you want is actually different from what you want. You appear to know that what you believe you want is different from what you actually want. Proof by contradiction that what you believe you want is what you actually want?
Your utility-function seems to assign high utility to world states where it is optimized according to new information. In other words, you believe that your utility-function should be undergoing recursive self-improvement.
I think Nesov’s saying that you have a utility function, but you don’t explicitly know it to the degree that you can make statements about its content. Or at least, it would be more accurate to use the best colloquial term, and leave the term of art “utility function” to its technical meaning.
Also, your penultimate paragraph sounds confused, while the paragraph it’s responding to is confusing but coherent. Nesov’s explicitly listing a variety of related but different categories that “utility function” gets misinterpreted into. He doesn’t claim to believe that what he wants is different from what he wants.
...do only agents who know what they want have a utility-function?
Nope—in theory, all agents have a utility-function—though it might not necessarily be the neatest way of expressing what they value.
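One standard way to cash out the "not necessarily the neatest way" caveat (a sketch, not necessarily what the commenter had in mind): any deterministic policy can be trivially represented as maximizing some utility function, but that representation may be no more illuminating than the behaviour itself.

```latex
% Trivial representation: given a deterministic policy \pi mapping each
% history prefix h_{<t} to an action, define a utility function over histories by
u(h) =
\begin{cases}
  1 & \text{if } a_t = \pi(h_{<t}) \text{ for every step } t \text{ of } h,\\
  0 & \text{otherwise.}
\end{cases}
% The agent maximizes expected u, so it "has" a utility function, but u is
% just the policy restated and compresses nothing about what the agent values.
```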