That’s a good observation, but it doesn’t completely solve the problem. The problem here is not just the trolley problem. The problem is that people disagree on whether not pushing the fat man is a value, or a bug. The trolley problem is just one example of the difficulty of determining this in general.
There is a large literature on the trolley problem and on how to solve it. The view taken in the paper, arrived at by many experts after studying the problem and conducting polls and other research, is that humans have a moral value called the “principle of double effect”:
Harming another individual is permissible if it is the foreseen consequence of
an act that will lead to a greater good; in contrast, it is impermissible to harm
someone else as an intended means to a greater good.
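To make the rule concrete, here is what it might look like written as an explicit decision procedure. This is a toy illustration of my own, not the paper’s formalization; the field names are hypothetical, and all of the hard judgment calls (what counts as “intended” harm, what counts as a “greater good”) are hidden inside them:

```python
# Purely illustrative: the double-effect principle written as an explicit rule.
# The hypothetical fields of Action hide everything that is actually hard.
from dataclasses import dataclass

@dataclass
class Action:
    harms_someone: bool            # the act harms another individual
    harm_is_intended_means: bool   # the harm is the means by which the good is achieved
    leads_to_greater_good: bool    # the act leads to a greater good overall

def permissible_under_double_effect(a: Action) -> bool:
    """Harm as a foreseen side effect of a greater good is permissible;
    harm as an intended means to a greater good is not."""
    if not a.harms_someone:
        return True
    if a.harm_is_intended_means:
        return False
    return a.leads_to_greater_good

# The two standard trolley cases, as they are usually classified:
switch_track = Action(harms_someone=True, harm_is_intended_means=False, leads_to_greater_good=True)
push_fat_man = Action(harms_someone=True, harm_is_intended_means=True, leads_to_greater_good=True)
print(permissible_under_double_effect(switch_track))  # True
print(permissible_under_double_effect(push_fat_man))  # False
```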
Is this a value, or a bug? As long as we can’t all agree on that, there’s no reason to expect we can correctly figure out what are values and what are bugs.
There are really two problems:
Come up with a procedure to determine whether a behavior is a value or an error.
Convince most other people in the world that your procedure is correct.
Personally, I think a reasonable first step is to try to restrict ethics to utilitarian approaches. We’ll never reach agreement as long as there are people still trying to use rule-based ethics (such as the “double effect” rule). The difficulty of getting most people to agree that there are no valid non-utilitarian ethical frameworks is just a small fraction of the difficulty of the entire program of agreeing on human values.
Perhaps I am missing something here, but I don’t see why utilitarianism is necessarily superior to rule-based ethics. An obvious advantage of a rule-based moral system is the speed of computation. Situations like the trolley problem require extremely fast decision-making. Considering how many problems local optima cause in machine learning and optimization, I imagine that it would be difficult for an AI to assess every possible alternative and pick the one which maximized overall utility in time to make such a decision. Certainly, we as humans frequently miss obvious alternatives when making decisions, especially when we are upset, as most humans would be if they saw a trolley about to crash into five people. Thus, having a rule-based moral system would allow us to easily make split-second decisions when such decisions were required.
Of course, we would not want to rely on a simple rule-based moral system all the time, and there are obvious benefits to utilitarianism when time is available for careful deliberation. It seems that it would be advantageous to switch back and forth between these two systems based on the time available for computation.
If the rules in a rule-based ethical system were derived from utilitarian concerns, and were chosen to maximize the expected utility over all situations to which the rule might be applied, would it not make sense to use such a rule-based system for very important, split-second decisions?
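To make the proposed switching concrete, here is a rough sketch of the kind of thing I have in mind (the names and the crude time model are made up for illustration, not a worked-out proposal):

```python
from typing import Callable, Dict, List

def decide(options: List[str],
           expected_utility: Callable[[str], float],
           rule: Callable[[List[str]], str],
           time_budget_s: float,
           est_eval_cost_s: float) -> str:
    """Deliberate when there is time; otherwise fall back on the precompiled rule.

    `rule` is assumed to have been chosen offline to maximize expected utility
    averaged over the situations it gets applied to.
    """
    if len(options) * est_eval_cost_s > time_budget_s:
        return rule(options)                       # split-second case: apply the rule
    return max(options, key=expected_utility)      # deliberative case: full calculation

# Hypothetical usage with made-up numbers:
options = ["pull_lever", "do_nothing"]
eu: Dict[str, float] = {"pull_lever": 4.0, "do_nothing": -1.0}
default_rule = lambda opts: opts[-1]   # e.g. a conservative "don't intervene" rule
print(decide(options, eu.get, default_rule, time_budget_s=10.0, est_eval_cost_s=0.01))  # deliberates: "pull_lever"
print(decide(options, eu.get, default_rule, time_budget_s=0.01, est_eval_cost_s=0.01))  # no time: "do_nothing"
```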
Yes, rule-based systems might respond faster, and that is sometimes preferable.
Let me back up. I categorize ethical systems into different levels of meta. “Personal ethics” are the ethical system an individual agent follows. Efficiency, and the agent’s limited knowledge, intelligence, and perspective, are big factors.
“Social ethics” are the ethics a society agrees on. AFAIK, all existing ethical theorizing supposes that an agent’s ethics and its society’s ethics must be the same thing. This makes no sense; casual observation shows it is not the case. People have ethical codes, and they are seldom the same as the ethical codes that society tells them they should have. There are obvious evolutionary reasons for this. Social ethics and personal ethics are often at cross-purposes. Social ethics are inherently dishonest, because the most effective way of maximizing social utility is often to deceive people. We should expect, for instance, that telling people there is a distinction between personal ethics and social ethics is against every social ethics in existence.
(I don’t mean that social ethics are necessarily exploiting people. Even if you sincerely want the best outcome for people, and they have personal ethics such that you don’t need to deceive them into cooperating, many will be too stupid or in too much of a hurry to get good results if given full knowledge of the values that the designers of the social ethics were trying to optimize. Evolution may be the designer.)
“Meta-ethics” is honest social ethics—trying to figure out what we should maximize, in a way that is not meant for public consumption—you’re not going to write your conclusions on stone tablets and give them to the masses, who wouldn’t understand them anyway. When Eliezer talks about designing Friendly AI, that’s meta-ethics (I hope). And that’s what I’m referring to here when I talk about encoding human values into an AI.
Roughly, meta-ethics is “correct and thorough” ethics, where we want to know the truth and get the “right” answer (if there is one) about what to optimize.
Social ethics and personal ethics are likely to be rule-based, and that may be appropriate. Meta-ethics is an abstract thing, carried out, e.g., by philosophers in journal articles; speed of computation is typically not an issue.
Any rule-based system can be transformed into a utilitarian system, but not vice-versa. Any system that can produce a choice between any two outcomes or actions imposes a complete ordering on all possible outcomes or actions, and is therefore utilitarian.
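The transformation is straightforward for a finite set of outcomes, provided the rule’s pairwise verdicts are complete and transitive (a rule whose choices cycle cannot be represented this way). One way to do it is to score each outcome by how many alternatives the rule chooses it over; a toy sketch, with hypothetical outcome labels:

```python
from typing import Callable, Dict, List

def utility_from_rule(outcomes: List[str],
                      prefers: Callable[[str, str], bool]) -> Dict[str, int]:
    """Turn a pairwise choice rule into a numeric utility.

    `prefers(a, b)` is the rule-based system's verdict when forced to choose
    between a and b.  If those verdicts are complete and transitive over this
    finite set, then u(x) = "number of outcomes x is chosen over" reproduces
    exactly the same choices: u(x) > u(y) iff prefers(x, y).
    """
    return {x: sum(1 for y in outcomes if y != x and prefers(x, y)) for x in outcomes}

# Toy rule: rank outcomes by a fixed priority list.
priority = ["save_five", "save_one", "do_nothing"]
prefers = lambda a, b: priority.index(a) < priority.index(b)
print(utility_from_rule(priority, prefers))
# {'save_five': 2, 'save_one': 1, 'do_nothing': 0}
```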
The problem is that people disagree on whether not pushing the fat man is a value, or a bug.
People do, but how much of that disagreement is between people who have been exposed to utilitarian and consequentialist moral philosophy, and people who have not? The linked article says:
Is it morally permissible for Ian to shove the man? … studies across cultures have been performed, and the consistent answer is reached that this is not morally permissible.
The key word is “consistent”. The article does not (in this quote, and as far as I can see) highlight the disagreement that you are talking about. I, of course, am aware of this disagreement—but a large fraction of the people that I discuss this topic with are utilitarians. What the quote from the article suggests to me is that, outside a minuscule population of people who have been exposed to utilitarianism, there is not significant disagreement on this point.
If this is the case, then utilitarianism may have created this problem, and the solution may be as simple as rejecting utilitarianism.
You stated a problem: how to get people to agree. You gave your solution to the problem here (my emphasis):
Personally, I think a reasonable first step is to try to restrict ethics to utilitarian approaches. We’ll never reach agreement as long as there are people still trying to use rule-based ethics (such as the “double effect” rule). The difficulty of getting most people to agree that there are no valid non-utilitarian ethical frameworks is just a small fraction of the difficulty of the entire program of agreeing on human values.
I pointed out, however, that it is apparently utilitarianism that has introduced the disagreement in the first place. I explained why that seems to be so. So the problem may be utilitarianism. If so, then the solution is to reject it.
How are you judging the validity of an ethical framework? Everything I’ve read on the subject (which is not a huge amount) assesses ethical systems by constructing intuition-pumping examples (such as the trolley problem, or TORTURE vs. SPECKS, or whatever), and inviting the reader to agree that such-and-such a system gives the right, or the wrong answer to such-and-such an example. But what ethical system produces these judgements, with respect to which other ethical systems are being evaluated?
If this is the case, then utilitarianism may have created this problem, and the solution may be as simple as rejecting utilitarianism.
And here I thought you were going to conclude that this showed that the majority reaction was in error.
But what ethical system produces these judgements, with respect to which other ethical systems are being evaluated?
That’s the question I’m asking, not the question I’m answering. :)