There are two uses of ‘utility function’. One is analogous to Daniel Dennett’s “intentional stance”: you can choose to interpret an entity as having a utility function. This is always possible, but not necessarily a perspicuous way of understanding the entity, because you might end up with utility functions like “enjoys running in circles but is equally happy being prevented from running in circles”.
The second use is as an explicit component within an AI design. Tool-AIs do not contain such a component—they might have a relevance or accuracy function for evaluating answers, but it’s not a utility function over the world.
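To make the distinction concrete, here is a minimal sketch (all function and variable names are hypothetical, not taken from any actual system): the agent design carries an explicit utility function over predicted world-states, while the tool design only scores candidate answers.

```python
# Hypothetical sketch: explicit utility component vs. answer-scoring tool.

def agent_act(world_model, actions, utility):
    """Agent design: an explicit utility function over predicted world-states
    selects the action leading to the highest-utility predicted state."""
    return max(actions, key=lambda a: utility(world_model.predict(a)))

def tool_answer(question, candidates, accuracy):
    """Tool design: an accuracy/relevance score ranks candidate answers.
    The score is a function of answers, not of states of the world."""
    return max(candidates, key=lambda ans: accuracy(question, ans))
```

Nothing in `tool_answer` evaluates world-states, which is what is meant above by not having a utility function over the world.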
because you might end up with utility functions like “enjoys running in circles but is equally happy being prevented from running in circles”.
Is that a problem so long as some behaviors are preferred over others? You could have “is neutral about running in circles, but resists jumping up and down and prefers making abstract paintings”.
Tool-AIs do not contain such a component—they might have a relevance or accuracy function for evaluating answers, but it’s not a utility function over the world.
Wouldn’t that depend on the Tool-AI? Eliezer’s default no-akrasia AI does everything it can to fulfill its utility function. You presumably want it to be as accurate as possible, or perhaps only as accurate as is useful. Would it be a problem for it to ask for more resources? To earn money on its own initiative for more resources? To lobby to get laws passed to give it more resources? At some point it’s a problem if it’s going to try to rule the world to get more resources...
Tool-AIs do not contain such a component—they might have a relevance or accuracy function for evaluating answers, but it’s not a utility function over the world.
Wouldn’t that depend on the Tool-AI?
I think this is explicitly part of the “Tool-AI” definition: that it is not a Utility Maximizer.
I think there’s thorough confusion between two things: utilityA, utility as used in economics to try to predict humans (and predict them inaccurately), and utilityB, utility as in a model-based agent, where the utility is a mathematical function that takes in a description of the world and refers to real-world items only if you read into it something that is not there and cannot be put there.
Viciously maximizing some utilityB leads, given sufficient capability, to the vicious and oh-so-dangerous modification of the inputs to the utilityB function, i.e. wireheading (sketched in the code after this comment).
The AIs as we know them, agents or tools, are not utilityA maximizers. We do not know how to make a utilityA maximizer. Human intelligence also doesn’t seem to work as a utilityA maximizer. It is likely that a utilityA maximizer is a logical impossibility for agents embedded in the world, or at the very least requires major advances in the formalization of philosophy.
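A toy illustration of the utilityB/wireheading point above (everything here is hypothetical, just to fix ideas): utilityB is computed from the agent’s description of the world, so if the available actions include editing that description, an unconstrained maximizer prefers the edit over actually changing the world.

```python
# Hypothetical toy model of wireheading under a utilityB-style objective.

def utility_b(description):
    # utilityB sees only the agent's internal description of the world.
    return description["paperclips_observed"]

def act_in_world(description):
    # Actually making a paperclip changes the world, and hence the report.
    return {**description, "paperclips_observed": description["paperclips_observed"] + 1}

def tamper_with_inputs(description):
    # Overwriting the sensor report changes only the description.
    return {**description, "paperclips_observed": 10**9}

world_report = {"paperclips_observed": 3}
actions = [act_in_world, tamper_with_inputs]
best = max(actions, key=lambda a: utility_b(a(world_report)))
print(best.__name__)  # tamper_with_inputs: maximizing utilityB rewards wireheading
```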
It is likely that a utilityA maximizer is a logical impossibility for agents embedded in the world...
Very interesting and relevant! Can you elaborate or link? I think the case can be made based on Arrow’s theorem and its corollaries, but I’m not sure that’s what you have in mind.