What else can the utility function as implemented by your hardware depend on besides your qualia, and computations derived from your qualia?
Calling utility functions “wireheading” is a category error. Wireheading is either:
1. Directly acting on the machinery that implements one’s utility function to trivially satisfy this hardware, i.e. by directly injecting qualia rather than providing the qualia via what they are normally correlated with.
2. More broadly, altering one’s utility function to one that is trivial to satisfy, such as by reinforcement via 1.
Calling utility functions “wireheading” is a category error.
If you read my original comment, it’s clear that I meant wireheading is having a utility function that depends only on your qualia. Or maybe “choosing to have”.
What else can the utility function as implemented by your hardware depend on besides your qualia, and computations derived from your qualia?
Huh? So you think there’s nothing inside your head except qualia?
Beliefs aren’t qualia. Subconscious information isn’t qualia.
Directly acting on the machinery that implements one’s utility function to trivially satisfy this hardware, i.e. by directly injecting qualia rather than providing the qualia via what they are normally correlated with.
This sounds like a potentially good definition. But I’m unclear then why anyone using utility theory, and that definition, would object to wireheading. If you’ve got a utility function, and you can satisfy it, that’s the thing to do, right? Why does it matter how you satisfy it? You seem to be saying that the hardware implementation isn’t your real utility function, it’s just an implementation of it. As if the utility function stood somewhere outside you.
Huh? So you think there’s nothing inside your head except qualia?
Beliefs aren’t qualia. Subconscious information isn’t qualia.
Beliefs and subconscious information are derived from qualia and the information about the external world that they correlate with, no?
Utility functions are a convenient mathematical device for describing the preferences of entities in game theory and some decision theories, when those preferences are consistent. The concept is useful as a metaphor for “what we want”, but when it is used loosely like this, there are troubles.

As applied to humans, this flat-out doesn’t work. Empirically and as a general rule, we’re not consistent, and most of us can readily be money-pumped. We do not have a nice clean module that weighs outcomes and assigns real numbers to them. Nor do we feed outcome weights into a probability weighting module and then choose the maximum utility. Our values change on reflection.

Heck, we’re not even unitary entities. Our consciousness is multi-faceted. There are the left and right brains communicating and negotiating through the corpus callosum. The information immediately accessible to the consciousness, what we identify with, is rather different from the information our subconscious uses. We are a gigantic hack of an intelligence built upon the shifting sands of stimulus-response and reinforcement conditioning. These joints in our selves make it easier to wirehead, and essentially to kill our current selves, leaving only animal-level instincts, if that.
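To make the money-pump point concrete, here is a minimal sketch, with made-up items and prices of my own, of how cyclic preferences get exploited: an agent that prefers A to B, B to C, and C to A will pay a small fee for each “upgrade” and end up holding exactly what it started with, minus the fees.

```python
# Toy money pump: an agent with cyclic (intransitive) preferences
# A > B > C > A pays a small fee for each "upgrade" and just cycles forever.
# Hypothetical example; the items and numbers are illustrative only.

prefers = {("A", "B"), ("B", "C"), ("C", "A")}  # left item is preferred to right

def accepts_trade(offered, held):
    """The agent trades its held item for the offered one (paying a fee)
    whenever it prefers the offered item to the one it currently holds."""
    return (offered, held) in prefers

held, money, fee = "C", 10.0, 0.25
for offered in ["B", "A", "C", "B", "A", "C"]:  # the pump simply cycles its offers
    if accepts_trade(offered, held):
        held, money = offered, money - fee

print(held, money)  # back to holding "C", but 1.50 poorer
```

A consistent (transitive) preference ordering admits no such cycle, which is part of what lets it be summarized by a utility function at all.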
But I’m unclear then why anyone using utility theory, and that definition, would object to wireheading. If you’ve got a utility function, and you can satisfy it, that’s the thing to do, right? Why does it matter how you satisfy it? You seem to be saying that the hardware implementation isn’t your real utility function, it’s just an implementation of it. As if the utility function stood somewhere outside you.
There are multiple utility functions running around here. The basic point was that what I consider important now matters to what choices I make now. The fact that I can make the future me have a new utility function, satisfied by wireheading, does not register positively on my current utility function. In fact, because it throws away almost everything I now care about, I am unlikely to do it now. My goals are “satisfy my current utility function”, and are always that, because that’s what we mean by the abstraction of utility function. My goals are not to satisfy what preferences I may later have. My goals are not to change my preferences to be easier to satisfy, because that means my current goals are less likely to be satisfied. If my goals change, then they will have changed, and only then will I choose differently.

It’s not that my utility function stands outside of me: my utility function is part of me. Changing it changes me. It so happens that my utility function would be easily changed if I started directly stimulating my reward center. The reward center is not my utility function, though it is part of the implementation of my decision function (which, if it were coherent, could be summarized by a utility function and a set of probabilities). If we wish to identify the reward circuitry of my brain with a utility function, we’ve also got to posit a few other utility functions, and entities holding them, that are in a non-zero-sum game with the reward circuitry.
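A minimal sketch of that evaluation, with invented outcomes and weights purely for illustration: the choice is scored by the utility function the agent has now, so the wireheaded future scores poorly even though the post-wireheading self would rate it maximally.

```python
# Illustrative only: made-up futures and utilities, not a model of any real agent.

def u_current(world):
    """The agent's present utility function: cares about projects, friends, truth."""
    return 10 * world["projects"] + 8 * world["friends"] + 5 * world["truth"]

def u_wirehead(world):
    """The utility function the agent would have after wireheading:
    cares only about reward-center stimulation."""
    return 100 * world["stimulation"]

futures = {
    "keep_living":  {"projects": 1, "friends": 1, "truth": 1, "stimulation": 0},
    "wirehead_now": {"projects": 0, "friends": 0, "truth": 0, "stimulation": 1},
}

# The decision is made *now*, so it is scored by u_current:
best = max(futures, key=lambda f: u_current(futures[f]))
print(best)                                 # -> "keep_living"
print(u_wirehead(futures["wirehead_now"]))  # -> 100: the future self would approve,
                                            #    but that self isn't doing the choosing
```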
Beliefs and subconscious information are derived from qualia and the information about the external world that they correlate with, no?
Not as far as I know, no. You may be equating “qualia” with “percepts”. That’s not right.
The fact that I can make the future me have a new utility function, satisfied by wireheading, does not register positively on my current utility function. In fact, because it throws away almost everything I now care about, I am unlikely to do it now. My goals are “satisfy my current utility function”, and are always that, because that’s what we mean by the abstraction of utility function. My goals are not to satisfy what preferences I may later have.
If that analysis were correct, there would be no difficulty about wireheading. It would simply be an error.
There is a difficulty about wireheading, and I’m trying to talk about it. I’m looking at static situations: Is there something objectively wrong with a person plugged into themselves giving themselves orgasms forever?
The LW community has a consensus that there is something wrong with that. Yet they also have a consensus that there are no objective values. These are inconsistent.
You’re trying to say that wireheading is an error not because the final wirehead state reached is wrong, but because the path from here to there involved an error. That’s not a valid objection, for the reasons you gave in your comment: Humans are messy, and random variation is a natural part of the human hardware and software. And humans have been messy for some time. So if you can become a wirehead by a simple error, many people must already have made that error. And CEV has to incorporate their wirehead preferences equally with everyone else’s.
There’s something inconsistent about saying that human values are good, but the process generating those values is bad.
Not as far as I know, no. You may be equating “qualia” with “percepts”. That’s not right.
Well, I’m still not convinced there is a useful difference, though I see why philosophers would separate the concepts.
There is a difficulty about wireheading, and I’m trying to talk about it. I’m looking at static situations: Is there something objectively wrong with a person plugged into themselves giving themselves orgasms forever?
There is nothing objectively wrong with that, no.
The LW community has a consensus that there is something wrong with that. Yet they also have a consensus that there are no objective values. These are inconsistent.
The LW community has a consensus that there is something wrong with that, judged by our current parochial values that we want to maintain. Not objectively wrong, but wrong by a widely held inter-subjective agreement that lets us cooperate in trying to steer the future away from a course where everyone gets wireheaded.
You’re trying to say that wireheading is an error not because the final wirehead state reached is wrong,
No, I’m saying that the final state is wrong according to my current values. That’s what I mean by wrong: against my current values. Because it is wrong, any path reaching it must have an error in it somewhere.
And humans have been messy for some time. So if you can become a wirehead by a simple error, many people must already have made that error.
We haven’t had the technology to truly wirehead until quite recently, though various addictions can be approximations.
many people must already have made that error. And CEV has to incorporate their wirehead preferences equally with everyone else’s.
Currently, there aren’t enough wireheads, or addicts for that matter, to make much of a difference. Those that are wireheads want nothing more than to be wireheads, so I’m not sure that they would affect anything else under CEV. That’s one of the horrors of wireheading: all other values become lost. What we would have to worry about is a proselytizing wirehead, who wishes everyone else would convert. That seems an even harder end-state to reach than simple wireheading.
Personally, I don’t want CEV applied to the whole human race. I think large swathes of the human race hold values that conflict badly with mine, and still would after perfect reflection. Wireheads would just be a small subset of that.
Personally, I don’t want CEV applied to the whole human race. I think large swathes of the human race hold values that conflict badly with mine, and still would after perfect reflection. Wireheads would just be a small subset of that.
One of my intuitions about human value is that it is highly diverse, and any extrapolation will be unable to find consensus / coherence in the way desired by CEV. As such, I’ve always thought that the most likely outcome of augmenting human value through the means of successful FAI would be highly diverse subpopulations all continuing to diverge, with a sort of evolutionary pressure for who receives the most resources. Wireheads should be easy to contain under such a scenario, and would leave expansion to the more active groups.
We haven’t had the technology to truly wirehead until quite recently, though various addictions can be approximations.
I was reverting to my meaning of “wireheading”. Sorry about that.
Personally, I don’t want CEV applied to the whole human race. I think large swathes of the human race hold values that conflict badly with mine, and still would after perfect reflection. Wireheads would just be a small subset of that.
We agree on that.
I think one problem with CEV is that, to buy into CEV, you have to buy into this idea you’re pushing that values are completely subjective. This brings up the question of why anyone implementing CEV would want to include anybody else in the subset whose values are being extrapolated. That would be an error.
You could argue that it’s purely pragmatic—the CEVer needs to compromise with the rest of the world to avoid being crushed like a bug. But, hey, the CEVer has an AI on its side.
You could argue that the CEVer’s values include wanting to make other people happy, and that the CEVer believes it can do this by incorporating their values. There are two problems with this:
1. They would be sacrificing a near-infinite expected utility from propagating their values over all time and space for a relatively infinitesimal one-time gain of happiness on the part of those currently alive here on Earth. So these have to be CEVers with high discounting of the future (a rough discounting sketch follows this list), which makes me wonder why they’re interested in CEV.
2. Choosing the subset of people who manage to develop a friendly AI and set up CEV strongly selects for people who have the perpetuation of values as their dominant value. If someone claims that he will incorporate other peoples’ values in his CEV at the expense of perpetuating his own values because he’s a nice guy, you should expect that he has to date put more effort into being a nice guy than into CEV.
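The discounting sketch mentioned in point 1, with made-up numbers of my own: under exponential discounting at factor gamma per period, a payoff of v every period forever has present value v / (1 - gamma), so a one-time gain can dominate only when gamma is small, i.e. when the future is heavily discounted.

```python
# Illustrative numbers only: a one-time gain now versus a small payoff
# repeated every period forever, under exponential discounting.

def pv_perpetual(v, gamma):
    """Present value of receiving v every period forever (0 < gamma < 1)."""
    return v / (1.0 - gamma)

one_time_gain = 100.0   # one-off happiness of those alive now
per_period    = 1.0     # value of your own values persisting, per period

for gamma in (0.5, 0.9, 0.999):
    stream = pv_perpetual(per_period, gamma)
    winner = "one-time gain" if one_time_gain > stream else "perpetual stream"
    print(gamma, stream, winner)
# gamma=0.5   -> stream PV =    2.0 -> the one-time gain wins (heavy discounting)
# gamma=0.9   -> stream PV =   10.0 -> the one-time gain wins
# gamma=0.999 -> stream PV = 1000.0 -> the perpetual stream dominates
```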
If you’ve got a utility function, and you can satisfy it, that’s the thing to do, right? Why does it matter how you satisfy it? You seem to be saying that the hardware implementation isn’t your real utility function, it’s just an implementation of it. As if the utility function stood somewhere outside you.
I think I see your point: a wireheading utility function would value (1) for providing the reward with less effort, while a nonwireheading utility function would disvalue (1) for providing the reward without the desideratum.
What’s your definition of wireheading?
I didn’t define it as having an arbitrary utility function. I defined it as a utility function that depends only on your qualia.
You should define ‘qualia,’ then, in such a way that makes it clear how they’re causally isolated from the rest of the universe.
I didn’t say they were causally isolated.
If you think that the notion of “qualia” requires them to be causally isolated from the universe (which is my guess at why you even bring the idea up), then the burden is on you to explain why everyone who discusses consciousness except Daniel Dennett is silly.
In that case, nothing can be said to depend only on the qualia, because anything that depends on them is also indirectly influenced by whatever the qualia themselves depend on.
When you say a function depends only on a set of variables, you mean that you can compute the function given the value of those variables.
Emotional responses aren’t independent variables, they’re functions of past and present sensory input.
Are there any independent variables in the real world? Variables are “independent” given a particular analysis.
When you say a function depends only on a set of variables, you mean that you can compute the function given the value of those variables. It doesn’t matter whether those variables are dependent on other variables.
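A small sketch of what “depends only on” means here, using invented stand-in functions: the utility is computed from the qualia alone, even though the qualia are themselves a function of the external world.

```python
# Illustrative composition: the qualia are a function of the world-state, and the
# utility is a function of the qualia alone. The utility can be computed from q
# without looking at w, even though q itself depends on w.

def qualia(world_state):
    """Stand-in for perception/experience: derived from the external world."""
    return {"pleasure": world_state["sugar"], "pain": world_state["injury"]}

def utility(q):
    """Depends only on the qualia: given q, nothing else is needed to compute it."""
    return q["pleasure"] - q["pain"]

w = {"sugar": 3, "injury": 1}
q = qualia(w)
print(utility(q))  # -> 2; computed from q alone, though q was derived from w
```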