I think the most general response to your first three points would look something like this: Any superintelligence that achieves human values will be adjacent in design space to many superintelligences that cause massive suffering, so it’s quite likely that the wrong superintelligence will win, due to human error, malice, or arms races.
As to your last point, it looks more like a research problem than a counterargument, and I’d be very interested in any progress on that front :-)
Why so? Flipping the sign doesn’t get you “adjacent”, it gets you “diametrically opposed”.
If you really want chocolate ice cream, “adjacent” would be getting strawberry ice cream, not having ghost pepper extract poured into your mouth.
They said “adjacent in design space”. The Levenshtein distance between
return val;
and
return -val;
is 1.
So being served a cup of coffee and being served a cup of pure capsaicin are “adjacent in design space”? Maybe, but funny how that problem doesn’t arise or even worry anyone...
More like driving to the store and driving into the brick wall of the store are adjacent in design space.
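To make the edit-distance point above concrete, here is a minimal Python sketch (not part of the original exchange) that computes the Levenshtein distance between the two return statements; a single inserted character flips the sign of the returned value.

def levenshtein(a, b):
    # Classic dynamic-programming edit distance between strings a and b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]

print(levenshtein("return val;", "return -val;"))  # prints 1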
That’s a twist on a standard LW argument, see e.g. here:
Fragility of value is the thesis that losing even a small part of the rules that make up our values could lead to results that most of us would now consider as unacceptable
It seems to me that fragility of value can lead to massive suffering in many ways.
You’re basically dialing that argument up to eleven. From “losing a small part could lead to unacceptable results” you are jumping to “losing any small part will lead to unimaginable hellscapes”:
with a tall sharp peak (FAI) surrounded by a pit that’s astronomically deeper
Yeah, not all parts. But even if it’s a 1% chance, one hellscape might balance out a hundred universes where FAI wins. Pain is just too effective at creating disutility. I understand why people want to be optimistic, but I think being pessimistic in this case is more responsible.
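As a back-of-the-envelope illustration of that claim, here is a minimal Python sketch (not part of the original thread). It uses the 1% probability and the “one hellscape outweighs a hundred FAI wins” ratio from the comment above, and assumes for simplicity that the remaining 99% of outcomes are FAI wins.

p_hellscape = 0.01       # the 1% chance mentioned above
u_fai_win = 1.0          # utility of one universe where FAI wins (normalized)
u_hellscape = -100.0     # one hellscape assumed to outweigh ~a hundred FAI wins

expected_utility = (1 - p_hellscape) * u_fai_win + p_hellscape * u_hellscape
print(expected_utility)  # roughly -0.01: the small tail risk cancels the upside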
So basically you are saying that the situation is asymmetric: the impact/magnitude of possible bad things is much much greater than the impact/magnitude of possible good things. Is this correct?
Yeah. One sign of asymmetry is that creating two universes, one filled with pleasure and the other filled with pain, feels strongly negative rather than symmetric to us. Another sign is that pain is an internal experience, while our values might refer to the external world (though it’s very murky), so the former might be much easier to achieve. Another sign is that in our world it’s much easier to create a life filled with pain than a life that fulfills human values.
Yes, many people intuitively feel that a universe of pleasure and a universe of pain add to a net negative. But I suspect that’s just a result of experiencing (and avoiding) lots of sources of extreme pain in our lives, while sources of pleasure tend to be diffuse and relatively rare. The human experience of pleasure is conjunctive because in order to survive and reproduce you must fairly reliably avoid all types of extreme pain. But in a pleasure-maximizing environment, removing pain will be a given.
It’s also true that our brains tend to adapt to pleasure over time, but that seems simple to modify once physiological constraints are removed.
“one filled with pleasure and the other filled with pain, feels strongly negative rather than symmetric to us”
Comparing pains and pleasures of similar magnitude? People have a tendency not to do this, see the linked thread.
“Another sign is that pain is an internal experience, while our values might refer to the external world (though it’s very murky)”
You accept pain and risk of pain all the time to pursue various pleasures, desires and goals. Mice will cross electrified surfaces for tastier treats.
If you’re going to care about hedonic states as such, why treat the external case differently?
Alternatively, if you’re going to dismiss pleasure as just an indicator of true goals (e.g. that pursuit of pleasure as such is ‘wireheading’) then why not dismiss pain in the same way, as just a signal and not itself a goal?
Comparing pains and pleasures of similar magnitude?
My point was comparing pains and pleasures that could be generated with a similar amount of resources. Do you think they balance out for human decision making? For example, I’d strongly object to creating a box of pleasure and a box of pain; do you think my preference would go away after extrapolation?
“My point was comparing pains and pleasures that could be generated with a similar amount of resources. Do you think they balance out for human decision making?”
I think with current tech it’s cheaper and easier to wirehead to increase pain (i.e. torture) than to increase pleasure or reduce pain. This makes sense biologically: since organisms won’t go looking for ways to wirehead to maximize their own pain, evolution doesn’t need to ‘hide the keys’ as much as with pleasure or pain relief (where the organism would actively seek out easy means of subverting the behavioral functions of the hedonic system). Thus when powerful addictive drugs are available, such as alcohol, human populations evolve increased resistance over time. The sex systems evolve to make masturbation less rewarding than reproductive sex under ancestral conditions, desire for play/curiosity is limited by boredom, delicious foods become less pleasant when one is full or when the foods are not later associated with nutritional sensors in the stomach, etc.
I don’t think this is true with fine control over the nervous system (or a digital version) to adjust felt intensity and behavioral reinforcement. I think with that sort of full access one could easily increase the intensity (and ease of activation) of pleasures/mood such that one would trade them off against the most intense pains at ~parity per second, and attempts at subjective comparison when or after experiencing both would put them at ~parity.
People will willingly undergo very painful jobs and undertakings for money, physical pleasures, love, status, childbirth, altruism, meaning, etc. Unless you have a different standard for the ‘boxes’ than used in subjective comparison with rich experience of the things to be compared, I think we’re just haggling over the price re intensity.
We know the felt caliber and behavioral influence of such things can vary greatly. It would be possible to alter nociception and pain receptors to amp up or damp down any particular pain. This could even involve adding a new sense, e.g. someone with congenital deafness could be given the ability to hear (installing new nerves and neurons), and hear painful sounds, with artificially set intensity of pain. Likewise one could add a new sense (or dial one up) to enable stronger pleasures. I think that both the new pains and new pleasures would ‘count’ to the same degree (and if you’re going to dismiss the pleasures as ‘wireheading’ then you should dismiss the pains too).
“For example, I’d strongly object to creating a box of pleasure and a box of pain; do you think my preference would go away after extrapolation?”
You trade off pain and pleasure in your own life, are you saying that the standard would be different for the boxes than for yourself?
What are you using as the examples to represent the boxes, and have you experienced them? (As discussed in my link above, people often use weaksauce examples in such comparison.)
We could certainly make agents for whom pleasure and pain would use equal resources per util. The question is if human preferences today (or extrapolated) would sympathize with such agents to the point of giving them the universe. Their decision-making could look very inhuman to us. If we value such agents with a discount factor, we’re back at square one.
That’s what the congenital deafness discussion was about.
You have preferences over pain and pleasure intensities that you haven’t experienced, or over new durations of experiences you do know. Otherwise you wouldn’t have anything to worry about re torture, since you haven’t experienced it.
Consider people with pain asymbolia:
Pain asymbolia is a condition in which pain is perceived, but with an absence of the suffering that is normally associated with the pain experience. Individuals with pain asymbolia still identify the stimulus as painful but do not display the behavioral or affective reactions that usually accompany pain; no sense of threat and/or danger is precipitated by pain.
Suppose you currently had pain asymbolia. Would that mean you wouldn’t object to pain and suffering in non-asymbolics? What if you personally had only happened to experience extremely mild discomfort while having lots of great positive experiences? What about for yourself? If you knew you were going to get a cure for your pain asymbolia tomorrow would you object to subsequent torture as intrinsically bad?
We can go through similar stories for major depression and positive mood.
Seems it’s the character of the experience that matters.
Likewise, if you’ve never experienced skiing, chocolate, favorite films, sex, victory in sports, and similar things that doesn’t mean you should act as though they have no moral value. This also holds true for enhanced experiences and experiences your brain currently is unable to have, like the case of congenital deafness followed by a procedure to grant hearing and listening to music.
Music and chocolate are known to be mostly safe. I guess I’m more cautious about new self-modifications that can change my decisions massively, including decisions about more self-modifications. It seems like if I’m not careful, you can devise a sequence that will turn me into a paperclipper. That’s why I discount such agents for now, until I understand better what CEV means.