Meta-Preference Utilitarianism
There has been a long-standing debate among utilitarians about what should be maximized. Most fall on the side of either ‘average utility’ or ‘total utility’. A select group even chooses ‘minimum utility’ or other lesser-known methods.
Previously I have tried to solve this by charting different moral theories on different axes and by prioritizing those actions that achieve success under most moral theories (see Q-balance utilitarianism).
I have come to the conclusion that this is just a band-aid on a more fundamental problem. Whether we should choose total, average or even median utility isn’t something we can objectively decide. So I suggest that we go up one level and maximize what most people want to maximize.
Let’s say we were able to gauge everyone’s (underlying) preferences about how much they like certain methods of maximizing by holding a so-called utilitarian vote.
A utilitarian method that would violate your preferences completely would get a score of 0, one that would encapsulate your preferences perfectly would get a score of 1, and one that you like to a certain extent but not completely gets e.g. 0.732.
If there is such a thing as ‘meta-preference ambivalence’ we could gauge that too: people who do not have any meta-preferences in their utility function get a score of 0, people for whom the entire purpose in life is the promotion of average utilitarianism get a score of 1, and so on.
Just multiply the ambivalence by the meta-preference, then add the scores of each method together across everyone (add all the scores for “median utility” together, add all the scores for “total utility” together, etc.) and compare.
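As a rough sketch of the tallying described above (the names, scores, and the exact multiply-then-sum weighting here are illustrative assumptions, not a fixed specification):

```python
# Minimal sketch of the meta-preference vote described above.
# Each voter scores every aggregation method in [0, 1] and has an
# ambivalence weight in [0, 1]; weighted scores are summed per method.

voters = [
    # (ambivalence weight, {method: score, ...})  -- illustrative values
    (1.0, {"total": 0.9, "average": 0.3, "median": 0.1}),
    (0.5, {"total": 0.2, "average": 1.0, "median": 0.0}),
    (0.0, {"total": 0.7, "average": 0.7, "median": 0.7}),  # no meta-preferences
]

totals = {}
for weight, scores in voters:
    for method, score in scores.items():
        totals[method] = totals.get(method, 0.0) + weight * score

winner = max(totals, key=totals.get)
print(totals, "->", winner)
```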
Now one way of maximizing comes out on top, but should we pursue it absolutely or proportionally? If “total utilitarianism” wins with 70% and “average utilitarianism” loses with 30% of the vote, should we act as total utilitarians 100% of the time, or 70% of the time (acting as average utilitarians the other 30% of the time, at random intervals)?
Well, we could solve that with a vote too: “Would you prefer pursuing the winning method 100% of the time, or would you prefer pursuing it in proportion to its victory?”
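If the proportional option were to win, one simple way to implement it, as a sketch, would be to sample which method governs a given decision with probability equal to its vote share (the 70/30 split below is just the hypothetical example from above):

```python
import random

# Hypothetical vote shares from the meta-preference vote above.
vote_shares = {"total": 0.7, "average": 0.3}

def pick_method(shares):
    """Pick an aggregation method with probability proportional to its share."""
    methods, weights = zip(*shares.items())
    return random.choices(methods, weights=weights, k=1)[0]

# Each decision is then made under whichever method was drawn.
print(pick_method(vote_shares))
```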
And any other problem we’ll face after this will be decided by comparing people’s preferences about those problems as well. (This might, depending on how you interpret it, also solve the negative-positive utilitarianism debate)
This might be encroaching on contractualism, but I think this is a very elegant way to solve some of the meta-level disagreements in utilitarian philosophy.
This type of procedure may look inelegant for folks who expect population ethics to have an objectively correct solution. However, I think it’s confused to expect there to be such an objective solution. In my view at least, this makes the procedure described in the original post look pretty attractive as a way to move forward.
Because it includes considerations very similar to those presented in the original post here, I’ll try (for those who are curious enough to bear with me) to describe the framework I’ve been using for thinking about population ethics:
Ethical value is subjective in the sense that, if someone’s life goal is to strive toward state x, it’s no one’s business to tell them that they should focus on y instead. (There may be exceptions, e.g., in case someone’s life goals are the result of brainwashing.)
For decisions that do not involve the creation of new sentient beings, preference utilitarianism or “bare minimum contractualism” seem like satisfying frameworks. Preference utilitarians are ambitiously cooperative/altruistic and scale back any other possible life goals for the sake of getting maximal preference satisfaction for everyone, whereas “bare-minimum contractualists” obey principles like “do no harm” while mostly focusing on their own life goals. A benevolent AI should follow preference utilitarianism, whereas individual people are free to choose anything on the spectrum between full preference utilitarianism and bare-minimum contractualism. (Bernard Williams’s famous objection to utilitarianism is that it undermines a person’s “integrity” by alienating them from their own life goals. By focusing all their resources and attention on doing what’s best from everyone’s point of view, people don’t get to do anything that’s good for themselves. This seems okay if one consciously chooses altruism as a way of life, but overly demanding as an all-encompassing morality.)
When it comes to questions that affect the creation of new beings, the principles behind preference utilitarianism or bare-minimum contractualism fail to constrain all of the option space. In other words: population ethics is under-defined.
That said, it’s not the case that “anything goes.” Just because present populations have all the power doesn’t mean that it’s morally permissible to ignore any other-regarding considerations about the well-being of possible future people. We can conceptualize a bare-minimum version of population ethics as a set of appeals or principles by which newly created beings can hold their creators accountable. This could include principles such as:
All else equal, it seems objectionable to create minds that lament their existence.
All else equal, it seems objectionable to create minds and place them in situations where their interests are only somewhat fulfilled, if one could have easily provided them with better circumstances.
All else equal, it seems objectionable to create minds destined to constant misery, yet with a strict preference for existence over non-existence.
(While the first principle is about which minds to create, the second two principles apply to how to create new minds.)
Is it ever objectionable to fail to create minds – for instance, in cases where they’d have a strong interest in their existence?
This type of principle would go beyond bare-minimum population ethics. It would be demanding to follow in the sense that it doesn’t just tell us what not to do, but also gives us something to optimize (the creation of new happy people) – it would take up all our caring capacity.
Just because we care about fulfilling actual people’s life goals doesn’t mean that we care about creating new people with satisfied life goals. These two things are different. Total utilitarianism is a plausible or defensible version of a “full-scope” population ethical theory, but it’s not a theory that everyone will agree with. Alternatives like average utilitarianism or negative utilitarianism are on equal footing. (As are non-utilitarian approaches to population ethics that say that the moral value of future civilization is some complex function that doesn’t scale linearly with increased population size.)
So what should we make of moral theories such as total utilitarianism, average utilitarianism or negative utilitarianism? The way I think of them, they are possible morality-inspired personal preferences, rather than personal preferences inspired by the correct all-encompassing morality. In other words, a total/average/negative utilitarian is someone who holds strong moral views related to the creation of new people, views that go beyond the bare-minimum principles discussed above. Those views are defensible in the sense that we can see where such people’s inspiration comes from, but they are not objectively true in the sense that those intuitions will appeal in the same way to everyone.
How should people with different population-ethical preferences approach disagreement?
One pretty natural and straightforward approach would be the proposal in the original post here.
Ironically, this would amount to “solving” population ethics in a way that’s very similar to how common sense would address it. Here’s how I’d imagine non-philosophers to approach population ethics:
Parents are obligated to provide a very high standard of care for their children (bare-minimum principle).
People are free to decide against becoming parents (principle inspired by personal morality).
Parents are free to want to have as many children as possible (principle inspired by personal morality), as long as the children are happy in expectation (bare-minimum principle).
People are free to try to influence other people’s stances and parenting choices (principle inspired by personal morality), as long as they remain within the boundaries of what is acceptable in a civil society (bare-minimum principle).
For decisions that are made collectively, we’ll probably want some type of democratic compromise.
I get the impression that a lot of effective altruists have negative associations with moral theories that leave things underspecified. But think about what it would imply if nothing was underspecified: As Bernard Williams pointed out, if the true morality left nothing underspecified, then morally-inclined people would have no freedom to choose what to live for. I no longer think it’s possible or even desirable to find such an all-encompassing morality.
One may object that the picture I’m painting cheapens the motivation behind some people’s strongly held population-ethical convictions. The objection could be summarized this way: “Total utilitarians aren’t just people who self-orientedly like there to be a lot of happiness in the future! Instead, they want there to be a lot of happiness in the future because that’s what they think makes up the most good.”
I think this objection has two components. The first component is inspired by a belief in moral realism, and to that, I’d reply that moral realism is false. The second component of the objection is an important intuition that I sympathize with. I think this intuition can still be accommodated in my framework. This works as follows: What I labelled “principle inspired by personal morality” wasn’t a euphemism for “some random thing people do to feel good about themselves.” People’s personal moral principles can be super serious and inspired by the utmost desire to do what’s good for others. It’s just important to internalize that there isn’t just one single way to do good for others. There are multiple flavors of doing good.
Thank you very much for this comment, it explained my thoughts better than I could have ever written.
Yes, I think moral realism is false and didn’t realize that was not a mainstream position in the EA community. I had trouble accepting it myself for the longest time and I was incredibly frustrated that all evidence seemed to point away from moral realism. Eventually I realized that freedom could only exist in the arbitrary and that a clockwork moral code would mean a clockwork life.
I’m only a first-year student so I’ll be very interested in seeing what a professional (like yourself) could extrapolate from this idea. The rough draft you showed me is already very promising and I hope you get around to eventually making a post about it.
I’m not entirely sure what moral realism even gets you. Regardless of whether morality is “real”, I still have attitudes towards certain behaviors and outcomes, and attitudes towards other people’s attitudes. I suspect the moral realism debate is confused altogether.
Here’s what I wrote in Six Plausible Meta-Ethical Alternatives: “Most intelligent beings in the multiverse share similar preferences. This came about because there are facts about what preferences one should have, just like there exist facts about what decision theory one should use or what prior one should have, and species that manage to build intergalactic civilizations (or the equivalent in other universes) tend to discover all of these facts. There are occasional paperclip maximizers that arise, but they are a relatively minor presence or tend to be taken over by more sophisticated minds.”
In the above scenario, once you become intelligent enough and philosophically sophisticated enough, you’ll realize that your current attitudes are wrong (or right, as the case may be) and change them to better fit the relevant moral facts.
I mean this could very well be true, but at best it points to some truths about convergent psychological evolution.
Sure, there are facts about what preferences would best enable the emergence of an intergalactic civilization. I struggle to see these as moral facts.
Also, there’s definitely a manifest-destiny-evoking, unquestioned moralizing of space exploration going on right now, almost as if morality’s importance is only as an instrument to our becoming hegemonic masters of the universe. The angle from which you approached this question is value-laden in an idiosyncratic way (not in a particularly foreign way, here on LessWrong, but value-laden nonetheless).
One can recognize that one would be “better off” with a different preference set without the alternate set being better in some objective sense.
I’m saying the self-reflective process that leads to increased parsimony between moral intuitions does not require objective realism of moral facts, or even the belief in moral realism. I guess this puts me somewhere between relativism and subjectivism according to your linked post?
There’s a misunderstanding/miscommunication here. I wasn’t suggesting “what preferences would best enable the emergence of an intergalactic civilization” are moral facts. Instead I was suggesting in that scenario that building an intergalactic civilization may require a certain amount of philosophical ability and willingness/tendency to be motivated by normative facts discovered through philosophical reasoning, and that this philosophical ability could eventually enable that civilization to discover and be motivated by moral facts.
In other words, it’s [high philosophical ability/sophistication causes both intergalactic civilization and discovery of moral facts], not [discovery of “moral facts” causes intergalactic civilization].
Well, I struggle to articulate what exactly we disagree on, because I find no real issue with this comment. Maybe I would say “high philosophical ability/sophistication causes both intergalactic civilization and moral convergence”? I hesitate to call the result of that moral convergence “moral fact”, though I can conceive of that convergence.
It gets you something that error theory doesn’t get you, which is that moral claims have truth values. And it gets you something that subjectivism doesn’t get you, which is some people being actually wrong, and not just different to you.
That’s parallel to pointing out that people still have opinions when objective truth is available. People should believe the truth (this site, passim) and similarly should follow the true morality.
Uh… I guess I cannot get around the regress involved in claiming my moral values superior to competing systems in an objective sense? I hesitate to lump together the kind of missteps involved in a mistaken conception of reality (a misapprehension of non-moral facts) with whatever goes on internally when two people arrive at different values.
I think it’s possible to agree on all mind-independent facts without entailing perfect accord on all value propositions, and that moral reflection is fully possible without objective moral truth. Perhaps I do not get to point at a repulsive actor and say they are wrong in the strict sense of believing falsehoods, but I can deliver a verdict on their conduct all the same.
It looks like some people can, since the attitudes of professional philosophers break down as:
Meta-ethics: moral realism 56.4%; moral anti-realism 27.7%; other 15.9%.
I can see how the conclusion would be difficult to reach if you make assumptions that are standard round here, such as
Morality is value
Morality is only value
All value is moral value.
But I suppose other people are making other assumptions.
Some verdicts lead to jail sentences. If Alice does something that is against Bob’s subjective value system, and Bob does something that is against Alice’s subjective value system, who ends up in jail? Punishments are things that occur objectively, so need an objective justification.
Subjective ethics allows you to deliver a verdict in the sense of “tut-tutting”, but morality is something that connects up with laws and punishments, and that’s where subjectivism is weak.
To make Wei Dai’s answer more concrete, suppose something like the symmetry theory of valence is true; in that case, there’s a crisp, unambiguous formal characterization of all valence. Then add open individualism to the picture, and it suddenly becomes a lot more plausible that many civilizations converge not just towards similar ethics, but exactly identical ethics.
I’m immensely skeptical that open individualism will ever be more than a minority position (among humans, at least). But at any rate, convergence on an ethic doesn’t demonstrate objective correctness of that ethic from outside that ethic.
My impression is that moral realism based on irreducible normativity is more common in the broader EA community than on LessWrong. But it comes in different versions. I also tend to refer to it as (a version of) “moral realism” if someone holds the belief that humans will reach a strong consensus about human values / normative ethical theories (if only they had ample time to reflect on the questions). Such convergence doesn’t necessarily require there to be irreducibly normative facts about what’s good or bad, but it still sounds like moral realism. The “we strongly expect convergence” position seemed to be somewhat prevalent on LessWrong initially, though my impression was that this was more of a probable default assumption rather than something anyone confidently endorsed, and over time my impression is that people have tentatively moved away from it.
I’m usually bad at explaining my thoughts too, but I’m persistent enough to keep trying. :P
Consider the system “do what you want”. While we might not accept this system completely (perhaps rejecting that it is okay to harm others if you don’t care about their wellbeing), it is an all-encompassing system, and it gives you complete freedom (including choosing what to live for).
You’re right that the system of ‘do what you want’ is an all-encompassing system. But it also leaves a lot of things underspecified (basically everything), which was (in my opinion) the more important insight.
Humans are not utilitarians, so any kind of utilitarianism is a proxy for what one really wants to achieve, and pushing it far enough means the tails coming apart and such. You are adding “voting” as a way to close the feedback loop from the unknown values to the metric, but that seems like a sledgehammer; you could just take a vote on each particular example, with no need to perform a complicated mixed utilitarian calculation.
Imagine a universe full of Robin Hanson lookalikes (all total utilitarians) that desperately want to kickstart the age of em (the repugnant conclusion). The dictator of this universe is a median utilitarian that uses black magic and nano-bots to euthanize all depressed people and sabotage any progress towards the age of em. Do you think that in this case the dictator should ideally change his behavior as to maximize the meta-preferences of his citizens?
My understanding of average vs total utilitarianism doesn’t yet tell me which one to vote for. You could ask me to vote anyway, but is there a reason why voting would give high quality answers to such questions?
We are talking about a hypothetical vote here, where we could glean people’s underlying preferences. Not what people think they want (people get that wrong all the time) but their actual utility function. This leaves us with three options:
1) You do not actually care about how we aggregate utility, this would result in an ambivalence score of 0
2) You do have an underlying preference that you just don’t know consciously, this means your underlying preference gets counted.
3) You do care about how we aggregate utility, but aren’t inherently in favor of either average or total. So when we gauge your ambivalence we see that you do care (1 or something high), but you really like both average (e.g. 0.9) and total (e.g. 0.9), with other methods like median and mode getting something low (e.g. 0.1).
In all cases the system works to accommodate your underlying preferences.
If you haven’t seen it, you may find this paper interesting: Geometric reasons for normalising variance to aggregate preferences, by Owen Cotton-Barratt (as an example of another potentially elegant approach to aggregating preferences).
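For a concrete picture, here is a minimal sketch of variance-normalized aggregation as I understand the general idea (rescale each person’s scores over the options to a common spread before summing); this is an assumption-laden illustration, not a faithful reproduction of the paper’s exact method:

```python
import numpy as np

def variance_normalized_aggregate(utilities):
    """Normalize each voter's utilities over the options to zero mean and
    unit variance, sum across voters, and return the aggregate scores
    plus the index of the winning option.

    utilities: array of shape (n_voters, n_options).
    """
    u = np.asarray(utilities, dtype=float)
    means = u.mean(axis=1, keepdims=True)
    stds = u.std(axis=1, keepdims=True)
    stds[stds == 0] = 1.0  # a fully indifferent voter adds nothing either way
    normalized = (u - means) / stds
    totals = normalized.sum(axis=0)
    return totals, int(np.argmax(totals))

# Illustrative scores for three aggregation methods (total, average, median)
scores = [[0.9, 0.3, 0.1],
          [0.2, 1.0, 0.0],
          [0.7, 0.7, 0.7]]
print(variance_normalized_aggregate(scores))
```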
I’m not convinced that utility aggregation can’t be objective.
We want to aggregate utilities because of altruism and because it’s good for everyone if everyone’s AI designs aggregate utilities. Altruism itself is an evolutionary adaptation with similar decision-theoretic grounding. Therefore if we use decision theory to derive utility aggregation from first principles, I expect a method to fall out for free.
Imagine that you find yourself in control of an AI with the power to seize the universe and use it as you command. Almost everyone, including you, prefers a certainty of an equal share of the universe to a lottery’s chance at your current position. Your decision theory happens to care not only about your current self, but also about the yous in timelines where you didn’t manage to get into this position. You can only benefit them acausally, by getting powerful people in those timelines to favor them. Therefore you look for people that had a good chance of getting into your position. You use your cosmic power to check their psychology for whether they would act as you are currently acting had they gotten into power, and if so, you go reasonably far to satisfy their values. This way, in the timeline where they are in power, you are also in a cushy position.
This scenario is fortunately not horrifying for those who never had a chance to get into your position, because chances are that someone you gave resources to directly or indirectly cares about them. How much everyone gets is now just a matter of acausal bargaining and the shape of their utility returns in resources granted.
This seems unnecessary. Ambivalent means the weight given to the different options is a 1:1 ratio.
What should the voting method to start with be?
EDIT: This other comment from the OP suggests that ratios aren’t taken into account, and that ambivalence is accounted for by asking about it as a separate question: https://www.lesswrong.com/posts/2NTSQ5EZ8ambPLxjy/meta-preference-utilitarianism#j4MMqj4aFdzjmJF5A
I mentioned the utilitarian voting method, also known as score voting. This is the most accurate way to gauge people’s preferences (especially if the amount of nuance is unbounded, e.g. 0.827938222...) if you don’t have to deal with people voting strategically (which would be the case if we were just checking people’s utility functions).
EDIT: Or maybe not? I’m not an expert on social choice theory, but I’m not entirely confident that Bayesian regret is the best metric anymore. So if a social choice theorist thinks I made a mistake, please let me know.
So instead of directly maximising any particular method of aggregating utility, the proposal seems to be that we should maximise how satisfied people are, in aggregate, with the aggregating method being maximised?
But should we maximise total satisfaction with the utility-aggregating method being maximised, or average satisfaction with that aggregating method?
And is it preferable to have a small population who are very satisfied with the utility aggregation method, or a much larger population who think the utility aggregation method is only getting it right slightly more often than chance?
Needs another layer of meta
(on a second look I see that you did indeed suggest voting on any such problems)
I think those are the same if the people whose votes are counted are only the people who already exist or will exist regardless of one’s choices. Total utilitarianism and average utilitarianism come apart on the question of how to count votes by people who are newly brought into existence.
I agree that this needs an answer. Personally, I think the proposal in question makes a lot of sense in combination with an approach that’s focused on the preferences of already existing people.
[Edit: the following example is bad. I might rewrite my thoughts about meta-preferentialism in the future, in which case I will write a better example and link to it here]
I did answer that question (albeit indirectly) but let me make it explicit.
Because of score voting, the issue between total and average aggregation is indeed dissolved (even with a fixed population).
Now I will note that in the case of the second problem, score voting will also resolve it the vast majority of the time, but let’s look at a (very) rare case where it would actually be a tie:
Alice and Bob want: Total (0.25), Average (1), Median (0)
Cindy and Dan want: Total (0.25), Average (0), Median (1)
And Elizabeth wants: Total (1), Average (0), Median (0)
So the final score is: Total (2), Average (2), Median (2)
(Note that for convenience I assume that this is with the ambivalence factor already calculated in)
In this case only one person is completely in favor of total, with the others being lukewarm toward it, but there is a very strong split on the average–median question. (Yes, this is a very bizarre scenario.)
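For concreteness, here is a quick sketch that tallies the illustrative scores above and confirms the three-way tie:

```python
# A quick check of the tallies in the example above (scores are the
# illustrative values given, with the ambivalence factor already folded in).
votes = {
    "Alice":     {"total": 0.25, "average": 1.0, "median": 0.0},
    "Bob":       {"total": 0.25, "average": 1.0, "median": 0.0},
    "Cindy":     {"total": 0.25, "average": 0.0, "median": 1.0},
    "Dan":       {"total": 0.25, "average": 0.0, "median": 1.0},
    "Elizabeth": {"total": 1.00, "average": 0.0, "median": 0.0},
}

totals = {}
for person_scores in votes.values():
    for method, score in person_scores.items():
        totals[method] = totals.get(method, 0.0) + score

print(totals)  # {'total': 2.0, 'average': 2.0, 'median': 2.0} -- a three-way tie
```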
Now numerically these all have the same preference, so the next question becomes: what do we pursue? This could be solved with a score vote too: How strong is your preference for:
(1) Picking one strategy at random, (2) pursuing each strategy 33% of the time, (3) picking the method that the fewest people gave a zero, (4) pursuing, in proportion, only the methods that more than one person gave a 1... etc., etc.
But what if, due to some unbelievable cosmic coincidence, that next vote also ends in a tie?
Well, you go up one more level until either the ambivalence takes over (I doubt I would care after 5 levels of meta) or until there is a tie-breaker. Although it is technically possible to have a tie across an infinite number of meta-levels, in reality this will never happen.
And yes you go as many levels of meta as needed to solve the problem. I only call it ‘meta-preference utilitarianism’ because ‘gauging-a-potentially-infinite-amount-of-meta-preferences utilitarianism’ isn’t quite as catchy.
I expect that this system, like democratic processes in general, would have problems because nearly all people, even on an individual level, don’t have clearly defined ideas of what they want to optimize. I’d expect a relatively small number of people to develop some understanding of the various choices and their effects, then the emergence of various political campaigns to promote one or another preference (for reasons idealistic and otherwise). I’d expect tribalism to set in rather quickly.
This is talking about the underlying preferences, not the surface level preferences. It’s an abstract moral system where we try to optimize people’s utility function, not a concrete political one where we ask people what they want.
I got that; I just don’t think most people have identifiable meta-preferences. I don’t. I expect less than half of Americans would quickly understand the concept of “meta-preferences”, and I’m pretty sure that God features prominently in the moral reasoning of most Americans (but perhaps not most Californians).
OTOH, I’m sure that many people have identifiable preferences and that some people are smart enough to work backwards. Somebody’s going to figure out which meta-preference leads to a lower tax rate and tell Fox News.
Voting relies on human judgement, which gets increasingly shaky the farther it gets from the humans’ concrete concerns. I think your approach magnifies the problems of democracy rather than solving them.