Overall, it sounds to me like people are confusing their feelings about (predicted) states of the world with caring about states directly.
But aren’t you just setting up a system that values states of the world based on the feelings they contain? How does that make any more sense?
You’re arguing as though neurological reward maximization is the obvious goal to fall back to if other goals aren’t specified coherently. But people have filled in that blank with all sorts of things. “Nothing matters, so let’s do X” goes in all sorts of zany directions.
You’re arguing as though neurological reward maximization is the obvious goal to fall back to if other goals aren’t specified coherently.
I’m not. My thought process isn’t “there aren’t any real values, so let’s go with rewards”; it’s not intended as a hack to fix value nihilism.
Rewards already do matter. Reward-seeking describes people’s behavior well (see PCT) and makes introspective sense. I can actually feel projected and real rewards come up, and how decisions arise based on them. I don’t know how “I value that there are many sentients” or any other external referent could come up; it would still be judged by the emotional reaction it causes (though not always in a fully conscious manner).
I think I can imagine agents that actually care about external referents and that wouldn’t wirehead. I just don’t think humans are such agents and I don’t see evidence to the contrary. For example, many humans have no problem with “fake” experiences, like “railroaded, specifically crafted puzzles to stimulate learning” (e.g. Portal 2), “insights that feel profound, but don’t mean anything” (e.g. entheogens) and so on. Pretty much the whole entertainment industry could be called wireheading lite.
But aren’t you just setting up a system that values states of the world based on the feelings they contain? How does that make any more sense?
Acting based on the feelings one will experience is something that already happens, so optimizing for it is sensible. (Non-wireheaded utopias would also optimize for those feelings, after all, just not only for them.)
A major problem I see with acting based on propositions about the world outside one’s mind is that it would assign different value to states that one can’t experimentally distinguish (successful mindless wallpaper vs. actual sentients, any decision after being memory-wiped, etc.). I can always tell if I’m wireheaded, however. I’d invoke Occam’s Razor here and ignore any proposal that generates no anticipated experiences.
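To make that concrete, here is a toy sketch (the names and numbers are made up purely for illustration): an externally referenced utility function assigns different values to two worlds that produce exactly the same observations, while an experience-based one cannot tell them apart, and so never pays for distinctions that generate no anticipated experiences.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WorldState:
    observations: tuple       # everything the agent will ever perceive
    contains_sentients: bool  # an external fact that no observation reveals

# Two worlds that are observationally identical:
wallpaper_world = WorldState(("happy chatter", "smiling faces"), contains_sentients=False)
sentient_world = WorldState(("happy chatter", "smiling faces"), contains_sentients=True)

def external_utility(state: WorldState) -> float:
    # Cares about an external referent: "there are actual sentients out there."
    return 1.0 if state.contains_sentients else 0.0

def experience_utility(state: WorldState) -> float:
    # Cares only about what gets experienced.
    return float(len(state.observations))

# The externally referenced utility distinguishes states no experiment can tell apart...
assert external_utility(wallpaper_world) != external_utility(sentient_world)
# ...while the experience-based utility assigns them the same value.
assert experience_utility(wallpaper_world) == experience_utility(sentient_world)
```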
Acting based on the feelings one will experience is something that already happens, so optimizing for it is sensible
I can’t really pick apart your logic here, because there isn’t any. This is like saying “buying cheese is something that already happens, so optimizing for it is sensible”.
Not really. Let me try to clarify what I meant.
We already know that rewards and punishments influence our actions. Any utopia would try to satisfy them. Even in a complex optimized universe full of un-wireheaded sentients caring about external referents, people would want to avoid pain, … and experience lots of excitement, … . Wireheading just says that this is all humans care about, so there’s no need for all those constraints; we can take the obvious shortcut.
In support of this view, I gave the example of the entertainment industry, which optimizes said experiences but is completely fake (and trying to become more fake), and of how many humans react positively to that. They don’t complain that something is missing; rather, they enjoy those improved experiences more than the existing externally referenced alternatives.
Also, take the reversed experience machine, in which the majority of students asked said they would stay plugged in. If they had the complex preferences typically cited against wireheading, wouldn’t they have immediately rejected it? An expected paperclip maximizer would have left the machine right away: it can’t build any paperclips there, so the machine has no value to it. But the reversed experience machine seems to have plenty of value for humans.
This is essentially an outside view argument against complex preferences. What’s the evidence that they actually exist? That people care about reality, about referents, all that? When presented with options that don’t fulfill any of this, lots of people still seem to choose them.
So, when people pick chocolate, it illustrates that that’s what they truly desire, and when they pick vanilla, it just means that they’re confused and really they like chocolate but they don’t know it.
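Pretty much the whole entertainment industry could be called wireheading lite.
Absolutely. Sudoku has been described as “a denial of service attack on human intellect”, and see also the seventh quote here.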
Rewards already do matter. Reward-seeking describes people’s behavior well (see PCT) and makes introspective sense.
PCT is not good to cite in this connection. PCT does not speak of rewards. According to PCT, behaviour is performed in order to control perceptions, i.e. to maintain those perceptions at their reference levels.
While it is possible for a control system to be organised around maximising something labelled a reward (or minimising something labelled a penalty), that is just one particular class of possible ways of making a control system. Unless one has specifically observed that organisation, there are no grounds for concluding that reward is involved just because something is made of control systems.
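For concreteness, here is a toy negative-feedback loop in the spirit of that description (the numbers and the one-line “environment” are assumptions for illustration, not anything from the PCT literature): the output acts to cancel the error between a perception and its reference level, and nothing in the loop computes or maximizes a reward.

```python
# Toy perceptual control loop: behaviour keeps a perception near its
# reference level; no reward signal is computed, let alone maximized.

reference = 21.0    # reference level for the perception (say, felt warmth)
perception = 15.0   # current value of the perception
gain = 0.5          # how strongly the output responds to error

for _ in range(20):
    error = reference - perception   # discrepancy from the reference, not a "reward"
    output = gain * error            # behaviour opposes the error
    perception += output             # the action feeds back into the perception

print(round(perception, 2))  # settles at the reference level: 21.0
```

A reward maximizer could be built out of such loops, but nothing in the loop itself requires one.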
Good point, I oversimplified here. I will consider this in more detail, but naively, isn’t this irrelevant in terms of wireheading? Maintaining perceptions is maybe a bit trickier to do, but there would still be obvious shortcuts. Maybe if those perceptions couldn’t be simplified in any relevant way, we’d need at least a full-on Matrix, and that would disqualify wireheading.