I have no doubt that you could make a pill that would convince someone that they were living an awesome life, complete with hallucinations of rocket-powered tyrannosaurs, and black leather lab coats.
The trouble is that merely hallucinating those things, or merely feeling awesome, is not enough.
The average optimizer probably has no code for experiencing utility; it only feels the utility of actions under consideration. The concept of valuing (or even having) internal experience is particular to humans, and it is in fact only one of the many things that we care about. Is there a good argument for why internal experience ought to be the only thing we care about? Why should we forget all the other things that we like and focus solely on internal experience (and possibly altruism)?
Can’t I simulate everything I care about? And if I can, why would I care about what is going on outside of the simulation, any more than I care now about a hypothetical asteroid on which the “true” purpose of the universe is written? Hell, if I could delete from my memory the fact that my utility function is being deceived, I’d gladly do so. Yes, it would bring some momentous negative utility, but that would be more than offset by the gains, especially stretched over a huge amount of time.
Now that I think about it... if, without an awesomeness pill, my decision would be to go and do battle in an eternal Valhalla where I polish my skills and have fun, and an awesomeness pill brings me that, except maybe better in some way I wouldn’t normally have thought of... what exactly is the problem here? The image of a brain with the utility slider moved to the max is disturbing, but I myself can avoid caring about that particular asteroid. An image of a universe tiled with brains storing infinite integers is disturbing; one of a universe tiled with humans riding rocket-powered tyrannosaurs is great. And yet they’re one and the same; we just can’t intuitively penetrate the black box that is the brain storing the integer. I’d gladly tile the universe with awesome.
If I could take an awesomeness pill and be whisked off somewhere where my body would be taken care of indefinitely, leaving everything else as it is, maybe I would decline; probably I wouldn’t. Luckily, once awesomeness pills become available, there probably won’t be starving children, so that point seems moot.
[PS.] In any case, if my space fleet flies by some billboard saying that all this is an illusion, I’d probably smirk, I’d maybe blow it up with my rainbow lasers, and I’d definitely feel bad for all those other fellas whose space fleets are a bit less awesome and significantly more energy-consuming than mine, all just because they’re bothered by silly billboards like this (provided our AI is still limited by, at the very least, entropy, and therefore limited in its ability to tile the world to infinity; if it can create as many real giant robots as it can awesomeness pills, it doesn’t matter which option is taken). If I’m allowed to have that knowledge and the resulting negative utility, that is.
[PPS.] I can’t imagine how an awesomeness pill would max my sliders for self-improvement, accomplishment, etc. without actually giving me the illusion of doing those things. I can imagine feeling intense pleasure; I can’t imagine feeling intense achievement separated from actually flying (or imagining that I’m flying) a spaceship. It wouldn’t feel as fulfilling, and it makes no sense that an awesomeness pill would separate them if it’s possible not to. It probably wouldn’t have me go through the roundabout process of actually doing all the stuff, and it probably would max my sliders even if I can’t imagine it, to an effect much different from the roundabout way, and by definition superior. As long as it doesn’t modify my utility function (as long as I value flying spaceships), I don’t mind.
Luckily, once awesomeness pills become available, there probably won’t be starving children, so that point seems moot.
This is a key assumption. Sure, if I assume that the universe is such that no choice I make affects the chances that a child I care about will starve—and, more generally, if I assume that no choice I make affects the chances that people will gain good stuff or bad stuff—then sure, why not wirehead? It’s not like there’s anything useful I could be doing instead.
But some people would, in that scenario, object to the state of the world. Some people actually want to be able to affect the total amount of good and bad stuff that people get.
And, sure, the rest of us could get together and lie to them (e.g., by creating a simulation in which they believe that’s the case), though it’s not entirely clear why we ought to. We could also alter them (e.g., by removing their desire to actually do good) but it’s not clear why we ought to do that, either.
I can’t imagine feeling intense achievement separated from actually flying (or imagining that I’m flying) a spaceship
Do you mean to distinguish this from believing that you have flown a spaceship?
Don’t we have to do it (lying to people) because we value other people being happy? I’d rather trick them (or rather, let the AI do so without my knowledge) than have them spend a lot of time angsting about not being able to help anyone because everyone has already been helped. (If there are people who can use your help, I’m not about to wirehead you, though.)
Do you mean to distinguish this from believing that you have flown a spaceship?
Yes. Thinking about simulating achievement got me confused about it. I can imagine intense pleasure or pain. I can’t imagine intense achievement; if I just got the surge of warmth I normally get, it would feel wrong, removed from flying a spaceship. Yet that doesn’t mean I don’t have an achievement slider to max; it just means I can’t imagine what maxing it indefinitely would feel like. Maxing the slider leading to hallucinations about performing achievement-related activities seems too roundabout; really, that’s the only thing I can say, and it feels like it won’t work that way. Can the pill satisfy terminal values without making me think I satisfied them? I think this question shows that the previous sentence is just me being confused. Yet I can’t imagine how an awesomeness pill would feel, hence I can’t dispel this annoying confusion.
[EDIT] Maybe a pill that simply maxes the sliders would make me feel achievement, but without flying a spaceship, hence making it incomplete, hence forcing the AI to include a spaceship hallucinator. I think I am/was making it needlessly complicated. In any case, the general idea is that if we are all opposed to just feeling intense pleasure without all the other stuff we value, then a pill that gives us only intense pleasure is flawed and would not even be given as an option.
Regarding the first bit… well, we have a few basic choices:
Change the world so that reality makes them happy
Change them so that reality makes them happy
Lie to them about reality, so that they’re happy
Accept that they aren’t happy
If I’m understanding your scenario properly, we don’t want to do the first because it leaves more people worse off, and we don’t want to do the last because it leaves us worse off. (Why our valuing other people being happy should be more important than their valuing actually helping people, I don’t know, but I’ll accept that it is.)
But why, on your view, ought we lie to them, rather than change them?
I attach negative utility to having my utility function changed; I wouldn’t change myself to maximize paperclips. I also attach negative utility to having my memory modified; I don’t like the normal decay that is happening even now, but having a large swath of my memory wiped is far worse. I also dislike being fed false information, but that is by far the least negative of the three, provided no negative consequences arise from the false belief. Hence, I’d prefer being fed false information to having my memory modified, and either of those to being made to stop caring about other people altogether. There is an especially big gap between the last one and the former two.
Thanks for summarizing my argument. I guess I need to work on expressing myself so I don’t force other people to work through my roundaboutness :)
Fair enough. If you have any insight into why your preferences rank in this way, I’d be interested, but I accept that they are what they are.
However, I’m now confused about your claim.
Are you saying that we ought to treat other people in accordance with your preferences of how to be treated (e.g., lied to in the present rather than having their values changed or their memories altered)? Or are you just talking about how you’d like us to treat you? Or are you assuming that other people have the same preferences you do?
For the preference ranking, I guess I can express it by saying that any change to my priorities leads to me doing things that would be positive utility at the time, but negative or neutral utility now (and since I could be spending that time generating positive utility instead, even neutral is bad). For example, if I could change my utility function to value eating babies, and babies were plentiful, that option would be a huge source of positive utility after the change. That doesn’t change the fact that it also means I’d eat a ton of babies, which makes the option a huge source of negative utility now; I wouldn’t want to do something that leads to me eating a ton of babies. If I greatly valued generating as much utility for myself at any given moment as possible, I would take the plunge; instead, I look at the future, decide not to take what is currently negative utility for me, and move on.

Or maybe I’m just making up excuses to refuse a momentary discomfort in exchange for eternal positive utility; after all, I bet someone having the time of his life eating babies would laugh at me and have more fun than me. The inconsistency here is that I avoid the negative-utility choice when it comes to changing my terminal values, but I have no issue taking the negative-utility choice when I decide I want to be in a simulation. Guess I don’t value truth that much.

I find that changing my memories leads to similar results as changing my utility function, but on a much, much smaller scale; after all, my memories are what make up my beliefs, my preferences, myself as a person. Changing them at all changes my belief system and preferences, but that’s happening all the time. Changing them on a large scale is significantly worse in terms of affecting my utility function; it can’t change my terminal values, so it’s still far less bad than directly making me want to eat babies, but still negative. Getting lied to is just bad, with no relation to the above two, and weakest in importance.
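Here’s a toy sketch of that reasoning, with entirely made-up names and numbers; the point is just that a candidate self-modification gets scored by my current utility function, not by the one I’d have afterwards.

```python
# A minimal sketch, assuming we model the choice as picking the action whose
# predicted outcome scores best under a given utility function. All names and
# numbers below are illustrative, not anyone's actual proposal.

def current_utility(outcome):
    # The agent as it is now: eating babies is strongly negative.
    return {"eat_babies": -1000, "status_quo": 0}[outcome]

def modified_utility(outcome):
    # The agent after the hypothetical value change.
    return {"eat_babies": +1000, "status_quo": 0}[outcome]

def outcome_of(action):
    # Self-modifying predictably leads to a future full of baby-eating.
    return "eat_babies" if action == "self_modify" else "status_quo"

def choose(actions, utility):
    # Pick the action whose predicted outcome the given utility function likes best.
    return max(actions, key=lambda a: utility(outcome_of(a)))

actions = ["self_modify", "keep_values"]

# Scored by the current utility function, the modification is rejected,
# even though the post-modification self would rate it highly.
print(choose(actions, current_utility))   # -> "keep_values"
print(choose(actions, modified_utility))  # -> "self_modify"
```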
My gut says that I should treat others as I want them to treat me. Provided a simulation is a bit more awesome, or comparably awesome but more efficient, I’d rather take that than the real thing. Hence, I’d want to give others what I myself prefer (in terms of ways to achieve preferences), not because they are certain to agree that being lied to is better than angsting about not helping people, but because my way is either better or worse than theirs, and I wouldn’t believe in my way unless I thought it better. Of course, I am also assuming that truth isn’t a terminal value to them. In the same way, since I don’t want my utility function changed, I’d rather not change theirs.
Hell, if I could delete from my memory the fact that my utility function is being deceived, I’d gladly do so. Yes, it would bring some momentous negative utility, but that would be more than offset by the gains, especially stretched over a huge amount of time.
I don’t understand this. If your utility function is being deceived, then you don’t value the true state of affairs, right? Unless you value “my future self feeling utility” as a terminal value, and this outweighs the value of everything else …
No, this is more about deleting a tiny discomfort; say, the fact that I know that all of it is an illusion. I attach a big value to my memory and especially object to sweeping changes to it, but I’ll rely on the pill, and thereby the AI, to decide what shouldn’t be deleted (because deleting it would interfere with the fulfillment of my terminal values) and what can be deleted (because it brings negative utility that isn’t necessary).
Intellectually, I wouldn’t care whether I’m the only drugged brain in a world where everyone else is flying real spaceships. I probably can’t fully deal with the intuition telling me I’m drugged, though. It’s not highly important; it’s just a passing discomfort when I think about this particular topic (passing and tiny, unless there are starving children). Whether it’s worth keeping around so I can feel in control and totally not drugged and imprisoned... I guess that depends on the circumstances.
So you’re saying that your utility function is fine with the world-as-it-is, but you don’t like the sensation of knowing you’re in a vat. Fair enough.