My 2011 “Utility counterfeiting” essay categorises the area a little differently:
It has “utility counterfeiting” as the umbrella category—and “the wirehead problem” and “the pornography problem” as sub-categories.
In this categorisation scheme, the wirehead problem involves getting utility directly, while the pornography problem involves getting utility by manipulating sensory inputs. The latter corresponds to Nozick’s experience machine, or Ring and Orseau’s delusion box.
Calling the umbrella category “wireheading” leaves you with the problem of what to call these subcategories.
You might be right. I thought about this too, but it seemed people on LW had already categorized the experience machine as wireheading. If we rebrand, we should maybe say “self-delusion” instead of “pornography problem”; I really like the term “utility counterfeiting” though and the example about counterfeit money in your essay.
“Utility counterfeiting” is a memorable term; but I wonder if we need a duller, less loaded expression to avoid prejudging the issue? After all, neuropathic pain isn’t any less bad because it doesn’t play any signalling role for the organism. Indeed, in some ways neuropathic pain is worse. We can’t sensibly call it counterfeit or inauthentic. So why is bliss that doesn’t serve any signalling function any less good or authentic? Provocatively expressed, evolution has been driven by the creation of ever more sophisticated counterfeit utilities that tend to promote the inclusive fitness of our genes. Thus e.g. wealth, power, status, maximum access to seemingly intrinsically sexy women of prime reproductive potential (etc) can seem inherently valuable to us. Therefore we want the real thing. This is an unsettling perspective because we like to think we value e.g. our friends for who they are rather than their capacity to trigger subjectively valuable endogenous opioid release in our CNS. But a mechanistic explanation might suggest otherwise.
Bill Hibbard apparently endorses using the wirehead terminology to refer to utility counterfeiting via sense data manipulation here. However, after looking at my proposal, I think it is fairly clear that the “wireheading” term should be reserved for the “simpleton gambit” of Ring and Orseau.
I don’t think my proposal represented a “rebranding”.
I do think you really have to invoke pornography or masturbation to describe the issue.
I think “delusion” is the wrong word. A delusion is a belief held with conviction—despite evidence to the contrary. Masturbation or pornography do not require delusions.
I don’t think that pornography and masturbation are good examples, because they aren’t actually generating counterfeit utility for the persons using them. People want to have real sex, true, but that is a manifestation of a more general desire to have pleasurable sexual experiences. Genuine sex satisfies these desires best of all, but pornography and masturbation are both less effective, but still valid, ways of satisfying this desire. The utility they generate is totally real.
What pornography and masturbation are generating counterfeit utility for is natural selection, providing you are modelling natural selection as an agent with a utility function (I’m assuming you are). Obviously natural selection “wants” people to have sex, so from its metaphorical “point of view” pornography and masturbation are counterfeit utility. But human beings don’t care about what natural selection “wants” so the utility is totally real for them.
Wireheading, as I understand it from this essay, is when an agent does something that does not maximize its utility function, but instead maximizes a crude approximation of its function. Pornography and masturbation, by contrast, are an instance where an agent is maximizing its genuine utility function. The illusion that they are similar to wireheading comes from confusing the utility function of those agents’ creator (natural selection) with the utility function of the agents themselves. Obviously humans and natural selection have different utility functions.
Eliezer put it well in his comment when he said:
“In the AI case, the AI is performing exactly as it was defined, in an internally unified way; the ideals by which it is called ‘wireheaded’ are only the intentions and ideals of the human programmers.”
If you replace “AI” with “Human Beings” and “human programmers” with “natural selection” then he is making the same point you are.
This isn’t looking at things from nature’s point of view, especially. The point is that pornography and masturbation are forms of sensory stimulation that mimic the desired real world outcomes (finding a mate) without actually leading towards them. If you ignore what natural selection wants, and just consider what people say they want, pornography and masturbation still look like reasonable examples of counterfeit utility to me.
Anyway, if you don’t like my examples, the real issue is whether you can think of better terminology.
Humans do desire finding a mate. However, they also value sexual pleasure and looking at naked people as ends in themselves. Finding a mate and having sex with them is obviously the ideal outcome since it satisfies both of those values at the same time. But pornography and masturbation are better than nothing: they satisfy one of those values.
People say they wish they could have sex with a mate instead of having to masturbate to porn. But that doesn’t mean they don’t value porn or masturbation; it just means that sex with a mate is even more valuable. They aren’t fooling themselves. They’re just satisfying their desire in a less effective manner, because they lack access to more efficient means.
Your examples are terrific when discussing the problems an agent with a utility function has when it is trying to create another agent and imbue it with the same utility function. I think that was the point of your essay.
Wireheading is kind of like this. Wireheading is when an agent simplifies its utility function for easier computation and then continues to follow the simplified version even in instances where it seriously conflicts with the real utility function. I don’t think pornography is an example of this, because most people will drop pornography immediately if they get a chance at real sex. This indicates pornography is probably a less efficient way of obtaining the values that sex obtains, rather than a form of wireheading.
I think you could say that about practically any example. You could say that people watching Friends are fulfilling some of their values by learning about social interaction—rather than just feeding themselves a fake social life in which they have really funny quirky friends. You could say that ladies with cute dogs are fulfilling their desire to love and be loved—rather than creating a fake baby to satisfy their maternal instincts. We won’t find a perfect example, we just want a pretty good one.
On “I don’t think pornography is an example of this”: me neither. I was trying to characterise the pornography problem, not the wirehead problem.
Unwillingness to replace the fake simulation with the real thing (if it is freely available) isn’t really a feature of the pornography problem. The real thing may well be better than the fake simulation. That doesn’t represent a problem with the example, but rather is a widespread feature of the phenomenon being characterized.
I agree, your term is much more descriptive and less susceptible to conflation with the other terms.
Anja, nice post.
@timtyler, you have made a nice point about taxonomy—also noting your comment re Hibbard below.
I suggest classifying like this:
Agents that maximize a utility register, a memory location that can be hijacked (as Utilitron; something similar happened with Eurisko).
Agents that maximize an internally-calculated utility function of either input (observations) or of world-model. Agents that maximize a function of the input stream can hijack that input stream or any point in the pipeline of calculations that produces this number. Drugs and electrical wireheading relate to this.
Agents that maximize a reward provided from the outside, whether from the creator or the environment at large. The reward function may be unknown to the agent. These agents can hijack the reward stream.
All these are distinct from:
Wireheading in humans, which as Eliezer points out, results from different desires of different mental parts.
Paperclippers, which could naively be seen as wireheading if we falsely liken their simplistic behavior to that of a human who is satisfying a simple pleasure sensation as opposed to a more complex value system: “Why are you going wild with stimulating your cravings for making paperclips, like humans who overeat, rather than considering more deeply what would be the right thing to do?”
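To make the three agent classes in the suggested classification a bit more concrete, here is a rough Python sketch of where the "hijack point" sits in each case. Everything in it is an illustrative assumption rather than anything from the post or the cited papers: the toy `World`, the class names, the `hijack` methods and the arbitrary large reward value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class World:
    """Toy environment (hypothetical): only `paperclips` really matters."""
    paperclips: int = 0


class RegisterAgent:
    """Class 1: maximizes a utility register, i.e. a memory location."""

    def __init__(self) -> None:
        self.register = 0.0

    def work(self, world: World) -> None:
        # Honest route: change the world, then credit the register.
        world.paperclips += 1
        self.register += 1.0

    def hijack(self) -> None:
        # Overwrite the register directly; the world is untouched.
        self.register = float("inf")


class ObservationAgent:
    """Class 2: computes utility internally from its input stream."""

    def __init__(self) -> None:
        self.sensor = lambda world: world.paperclips

    def utility(self, world: World) -> float:
        return float(self.sensor(world))

    def hijack(self) -> None:
        # Replace the input stream with a fake one (a "delusion box").
        self.sensor = lambda world: 10**9


class ExternalRewardAgent:
    """Class 3: reward arrives from outside; the function may be unknown."""

    def __init__(self, reward_channel) -> None:
        self.reward_channel = reward_channel

    def reward(self, world: World) -> float:
        return self.reward_channel(world)

    def hijack(self) -> None:
        # Seize the reward stream itself rather than earning reward.
        self.reward_channel = lambda world: 1e9


def demo():
    world = World()
    a1 = RegisterAgent()
    a1.work(world)  # one honest paperclip, then the shortcut
    a1.hijack()
    a2 = ObservationAgent()
    a2.hijack()
    a3 = ExternalRewardAgent(lambda w: float(w.paperclips))
    a3.hijack()
    return world.paperclips, a1.register, a2.utility(world), a3.reward(world)
```

In this toy, all three agents end up reporting enormous utility while the world holds a single paperclip. The only difference between the classes is which link in the chain gets overwritten: the register, the sensor, or the reward channel.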