Rob Bensinger answers Why do you reject negative utilitarianism?

Rob Bensinger 12 Feb 2019 1:22 UTC
48 points
0
I find negative utilitarianism unappealing for roughly the same reason I’d find “we should only care about disgust” or “we should only care about the taste of bananas” unappealing. Or if you think suffering is much closer to a natural kind than disgust, then supply some other mental (or physical!) state that seems more natural-kind-ish to you.
“Only suffering ultimately matters” and “only the taste of bananas ultimately matters” share the virtue of simplicity, but they otherwise run into the same difficulty, which is just that they don’t exhaustively describe all the things I enjoy or want or prefer. I don’t think my rejection of bananatarianism has to be any more complicated than that.
Something I wrote last year in response to a tangentially related paper:
I personally care about things other than suffering. What are negative utilitarians saying about that?
Are they saying that they don’t care about things like friendship, good food, joy, catharsis, adventure, learning new things, falling in love, etc., except as mechanisms for avoiding suffering? Are they saying that I’m deluded about having preferences like those? Are they saying that I should try to change my preferences — and if so, why? Are they saying that my preferences are fine in my personal decision-making as an individual, but shouldn’t get any weight in an idealized negotiation about what humanity as a group should do (ignoring any weight my preferences get from non-NU views that might in fact warrant a place at the bargaining table for more foundational or practical reasons distinct from the NU ideal) — and if so, why?
[...] “It’s wrong to ever base any decision whatsoever on my (or anyone else’s) enjoyment of anything whatsoever in life, except insofar as that enjoyment has downstream effects on other things” is an incredibly, amazingly strong claim. And it’s important in this context that you’re actually making that incredibly strong claim: more mild “negative-leaning” utilitarianisms (which probably shouldn’t be associated with NU, given how stark the difference is) don’t have to deal with the version of the world destruction argument I think x-risk people tend to be concerned about, which is not ‘in some scenarios, careful weighing of the costs and benefits can justify killing lots of people’ but rather ‘any offsets or alternatives to building misaligned resource-hungry AGI (without suffering subsystems) get literally zero weight, if you’re sufficiently confident that that’s what you’re building; there’s no need to even consider them; they aren’t even a feather on the scale’. I just don’t see why the not-even-a-feather-on-the-scale view deserves any more attention or respect than, e.g., divine-command theory; in an argument between the “negative-leaning” utilitarian and the real negative utilitarian, I don’t think the NU gets any good hits in.
(Simplicity is a virtue, but not when it’s of the “I’m going to attempt to disregard every consideration in all of my actions going forward except the expected amount of deliciousness in the future” or ”… except the expected amount of lying in the future” variety; so simplicity on its own doesn’t raise the view to the level of having non-negligible probability compared to negative-learning U.)
What links here?
- Twitter-length responses to 24 AI alignment arguments by RobBensinger (EA Forum; 14 Mar 2022 19:34 UTC; 67 points)
- Teo Ajantaival 12 Feb 2019 20:28 UTC
  4 points
  Parent
  Yes, I am making the (AFAICT, in your perspective) “incredibly, amazingly strong claim” that in a unified theory, only suffering ultimately matters. In other words, impartial compassion is the ultimate scale (comparator) to decide conflicts between expected suffering vs. other values (whose common basis for this comparison derives from their complete, often context-dependent relationship to expected suffering, including accounting for the wider incentives & long-term consequences from breaking rules that are practically always honored).
  I find negative utilitarianism unappealing for roughly the same reason I’d find “we should only care about disgust” or “we should only care about the taste of bananas” unappealing.
  Roughly? Suffering is not an arbitrary foundation for unification (for a “common currency” underlying an ideally shared language for cause prioritization). Suffering is the clearest candidate for a thing we all find terminally motivating, at least once we know what we’re talking about (i.e., aren’t completely out of touch with the nature of extreme suffering, as evaluators & comparators of experiences are expected not to be). Personally, I avoid the semantics of arguing over what we “should” care about. Instead, I attempt to find out what I do care about, what these values’ motivational currency is ultimately derived from, and how could I unify these findings into a psychologically realistic model with consistent practical implications & minimal irreconcilable contradictions (such as outweighing between multiple terminal values, because I’m a skeptic of aggregation over space and time; aggregate experiences physically exist only as thoughts not fit to outweigh suffering, which only preventing more suffering can do).
  “Only suffering ultimately matters” and “only the taste of bananas ultimately matters” share the virtue of simplicity, but they otherwise run into the same difficulty, which is just that they don’t exhaustively describe all the things I enjoy or want or prefer. I don’t think my rejection of bananatarianism has to be any more complicated than that.
  I agree re: bananatarianism, but there’s more to unpack from the suffering-motivated unification than meets the eye.
  No verbal descriptions can exhaustively describe all the things we enjoy, want, or prefer, because our inner homeostatic & psychological dynamics contain processes that are too multidimensional for simple overarching statements. What we can do, is unpack the implications of {“Only suffering ultimately matters”} to see how this can imply, predict, retrodict, and explain our other motivations.
  In evolutionary and developmental history terms, we can see at the first quick glance that many (if not immediately all) of our other motivations interact with suffering, or have interacted with our suffering in the past (individually, neurally, culturally, evolutionarily). They serve functions of group cohesion, coping with stress, acquiring resources, intimacy, adaptive learning & growth, social deterrence, self-protection, understanding ourselves, and various other things we value & honor because they make life easier or interesting. Neither NU nor other systems will honor all of our perceived wants as absolutes to maximize (reproduction) or to even respect at all (revenge; animalistic violence; desire for personal slaves or worship), but most of our intuitively nominated “terminal” values need not be overriden by the slightest suffering, because they do serve strong functions to prevent suffering, especially when they seem to us like autonomous goals without constantly reminding us of how horrible things did and would happen without them. NU simply claims that it is the most diplomatic solution for a unified theory to de-attach from other values as absolutes, and respect them to the degree that we need them (practicing epistemic uncertainty when we do not yet understand the full role of something we intuitively deeply value!). This may in practice lead to great changes, ideally in directions of more self-compassion and general compassion for others without our other values overriding our motivation to prevent as many cases of extreme suffering as we can.
  A considerate rejection of NU needs to be more complicated than of bananatarianism, because the unity, applicability, and explanatory power of NU relies on its implications (instead of explicit absolute rules or independent utility assignments for exemplary grounding-units of every value—challenges for other systems), and its weights of instrumental value depend not on static snapshots of worlds, but on the full context and counterfactuals epistemically accessible to us in each situation. In extreme situations, we may decide it worthwhile to simulate the expected long-term consequences of possibly bending rules that we normally accept as near-absolute heuristics to save ourselves from resource-intensive overthinking (e.g., the degree to which we respect someone’s autonomy, in emergencies). This doesn’t imply that superweapon research is a low-hanging fruit for aspiring suffering-minimizers (for reasons I won’t detail right now, because I find the world-wiping objections worth addressing mostly in the context of AGI assumptions; my primary interest here is unification for practical cause prioritization, worth noting).
  To actually reject NU, you must explain what makes something (other than suffering) terminally valuable (or as I say, motivating) beyond its instrumental value for helping us prevent suffering in the total context. This instrumental value is multifaceted and can be derived from various kinds of relationships to suffering. So other “terminal” values may serve important purposes, including that they help us (some examples in parentheses):
  - cope with suffering (coping mechanisms, friendship, community)
  - avoid ruminating on suffering (explicit focus on expansive, positive language and goals that don’t contain reminders of their possibly suffering-mediated usefulness)
  - re-interpret suffering (humor, narratives, catharsis)
  - prevent suffering (science, technology, cognitive skills & tools)
  - understand suffering (wide & deep personal experience, culture)
  - predict suffering (science)
  - skip epistemic difficulties of trying to optimize others’ suffering for them (autonomy)
  - prevent the abuse of our being motivated by suffering (human rights, justice system, deterrence)
  - help others’ suffering (life, health, freedom, personal extra resources to invest as we see fit, reducing x-risk)
  - resilience against suffering (experience, intelligence, learning, cultural memory)
  - safety against emergencies (family, intimacy, community, institutions)
  - help us relax and regain our ability to help (good food, joy, replenishing activities)
  - avoid a horrible, anxiety-epidemic spreading societal collapse (not getting caught secretly killing people and everyone who knew them, in the name of compassion, by not doing this)
  To reject NU, is there some value you want to maximize beyond self-compassion and its role for preventing suffering, at the risk of allowing extreme suffering? How will you tell this to someone undergoing extreme suffering?
  NUs are not saying you are deluded for valuing multiple things. But you may be overly attached to them if you—beyond self-compassion—would want to spend your attention on copying/boosting instances of them rather than on preventing others from having to undergo extreme suffering.
  After writing this, I wonder if the actual disagreement is still the fear that an NU-AGI would consider humans less {instrumentally valuable for preventing suffering} than it would consider {our suffering terminally worth preventing}. This feels like a very different conversation than what would be a useful basis for a common language of cause prioritization.
  - Rob Bensinger 13 Feb 2019 0:42 UTC
    19 points
    0
    Parent
    In evolutionary and developmental history terms, we can see at the first quick glance that many (if not immediately all) of our other motivations interact with suffering, or have interacted with our suffering in the past (individually, neurally, culturally, evolutionarily). They serve functions of group cohesion, coping with stress, acquiring resources, intimacy, adaptive learning & growth, social deterrence, self-protection, understanding ourselves, and various other things we value & honor because they make life easier or interesting.
    Seems like all of this could also be said of things like “preferences”, “enjoyment”, “satisfaction”, “feelings of correctness”, “attention”, “awareness”, “imagination”, “social modeling”, “surprise”, “planning”, “coordination”, “memory”, “variety”, “novelty”, and many other things.
    “Preferences” in particular seems like an obvious candidate for ‘thing to reduce morality to’; what’s your argument for only basing our decisions on dispreference or displeasure and ignoring positive preferences or pleasure (except instrumentally)?
    Neither NU nor other systems will honor all of our perceived wants as absolutes to maximize
    I’m not sure I understand your argument here. Yes, values are complicated and can conflict with each other. But I’d rather try to find reasonable-though-imperfect approximations and tradeoffs, rather than pick a utility function I know doesn’t match human values and optimize it instead just because it’s uncomplicated and lets us off the hook for thinking about tradeoffs between things we ultimately care about.
    E.g., I like pizza. You could say that it’s hard to list every possible flavor I enjoy in perfect detail and completeness, but I’m not thereby tempted to stop eating pizza, or to try to reduce my pizza desire to some other goal like ‘existential risk minimization’ or ‘suffering minimization’. Pizza is just one of the things I like.
    To actually reject NU, you must explain what makes something (other than suffering) terminally valuable (or as I say, motivating) beyond its instrumental value for helping us prevent suffering in the total context
    E.g.: I enjoy it. If my friends have more fun watching action movies than rom-coms, then I’ll happily say that that’s sufficient reason for them to watch more action movies, all on its own.
    Enjoying action movies is less important than preventing someone from being tortured, and if someone talks too much about trivial sources of fun in the context of immense suffering, then it makes sense to worry that they’re a bad person (or not sufficiently in touch with their compassion).
    But I understand your position to be not “torture matters more than action movies”, but “action movies would ideally have zero impact on our decision-making, except insofar as it bears on suffering”. I gather that from your perspective, this is just taking compassion to its logical conclusion; assigning some more value to saving horrifically suffering people than to enjoying a movie is compassionate, so assigning infinitely more value to the one than the other seems like it’s just dialing compassion up to 11.
    One reason I find this uncompelling is that I don’t think the right way to do compassion is to ignore most of the things people care about. I think that helping people requires doing the hard work of figuring out everything they value, and helping them get all those things. That might reduce to “just help them suffer less” in nearly all real-world decisions nowadays, because there’s an awful lot of suffering today; but that’s a contingent strategy based on various organisms’ makeup and environment in 2019, not the final word on everything that’s worth doing in a life.
    To reject NU, is there some value you want to maximize beyond self-compassion and its role for preventing suffering, at the risk of allowing extreme suffering? How will you tell this to someone undergoing extreme suffering?
    I’ll tell them I care a great deal about suffering, but I don’t assign literally zero importance to everything else.
    NU people I’ve talked to often worry about scenarios like torture vs. dust specks, and that if we don’t treat happiness as literally of zero value, then we might make the wrong tradeoff and cause immense harm.
    The flip side is dilemmas like:
    Suppose you have a chance to push a button that will annihilate all life in the universe forever. You know for a fact that if you don’t push it, then billions of people will experience billions upon billions of years of happy, fulfilling, suffering-free life, filled with richness, beauty, variety, and complexity; filled with the things that make life most worth living, and with relationships and life-projects that people find deeply meaningful and satisfying.
    However, you also know for a fact that if you don’t push the button, you’ll experience a tiny, almost-unnoticeable itch on your left shoulder blade a few seconds later, which will be mildly unpleasant for a second or two before the Utopian Future begins. With this one exception, no suffering will ever again occur in the universe, regardless of whether you push the button. Do you push the button, because your momentary itch matters more than all of the potential life and happiness you’d be cutting out?
    - Rob Bensinger 13 Feb 2019 0:45 UTC
      8 points
      0
      Parent
      And if you say “I don’t push the button, but only because I want to cooperate with other moral theorists” or “I don’t push the button, but only because NU is very very likely true but I have nonzero moral uncertainty”: do you really think that’s the reason? Does that really sound like the prescription of the correct normative theory (modulo your own cognitive limitations and resultant moral uncertainty)? If the negotiation-between-moral-theories spat out a slightly different answer, would this actually be a good idea?
  - Raemon 12 Feb 2019 21:34 UTC
    14 points
    1
    Parent
    This comment doesn’t seem to sufficiently engage with (what I saw as) the core question Rob was asking (and which I would ask), which was:
    I personally care about things other than suffering. What are negative utilitarians saying about that?
    Are they saying that they don’t care about things like friendship, good food, joy, catharsis, adventure, learning new things, falling in love, etc., except as mechanisms for avoiding suffering? Are they saying that I’m deluded about having preferences like those? Are they saying that I should try to change my preferences — and if so, why?
    You briefly note “you may be overly attached to them”, but this doesn’t give any arguments for why I might be overly attached to them, instead of attached to them the correct amount.
    When you ask:
    To actually reject NU, you must explain what makes something (other than suffering) terminally valuable (or as I say, motivating) beyond its instrumental value for helping us prevent suffering in the total context.
    My response is “to reject NU, all I have to do is terminally care about anything other than suffering. I care about things other than suffering, ergo NU must be false, and the burden is on other people to explain what is wrong with my preferences.”