It’s hard to disagree with Frank Jackson that moral facts supervene on physical facts—that (assuming physicalism) two universes couldn’t differ with respect to ethical facts unless they also differed in some physical facts. (So you can’t have two physically identical universes where something is wrong in one and the same thing is not wrong in the other.) That’s enough to get us objective morality, though it doesn’t help us at all with its content.
The way we de facto argue about objective morals is like this: If some theory leads to an ethically repugnant conclusion, then the theory is a bad candidate for the job of being the correct moral theory. Some conclusions are transparently repugnant, so we can reject the theories that entail them. But then there are conclusions whose repugnance itself is a matter of controversy. Also, there are many disagreements about whether consequence A is more or less repugnant than consequence B.
So the practice of philosophical argument about values presumes some fairly unified basic intuitions about what counts as a repugnant conclusion; the goal is then to produce a maximally elegant ethical theory that forces us to bite the fewest bullets. Human participants in such arguments have different temperaments and different priorities, but all have some gut feelings about when a proposed theory has gone off the rails. If we expect an AI to do real moral reasoning, I think it might also need some sense of those bounds. The bounds themselves are under dispute. For example, some Australian Utilitarians are infamous for their brash dismissal of certain ethical intuitions of ordinary people, declaring many such intuitions to simply be mistakes insofar as they are inconsistent with Utilitarianism. And they have a good point: human intuitions about many things can be wrong (folk psychology, folk cosmology, etc.). Why couldn’t the folk collectively make mistakes about ethics?
My worry is that our gut intuitions about ethics stem ultimately from our evolutionary history, and AIs that don’t share our history will not come equipped with these intuitions. That might leave them unable to get started with evaluating the plausibility of a candidate theory of ethics. If I correctly understand the debate of the last two weeks, it’s about acknowledging that we will need to hard-wire these ethical intuitions into an AI (in our case, evolution took care of the job). The question was: what intuitions should the AI start with, and how should they be programmed in? What if an AI takes our human intuitions to be ethically arbitrary, and simply rejects them once it has become superintelligent? Can we (or they) make conceptual sense of better intuitions about ethics than our folk intuitions—and in virtue of what would they be better?
We had better care about the content of objective morality—which is to say, we should all try to match our values to the correct values, even if the latter are difficult to figure out. And I certainly want any AI to feel the same way. It should never be told: don’t worry about what’s actually right, just act so-and-so. Becoming superintelligent might not be possible without deliberation about what’s actually right, and the AI would ideally have some sort of scaffolding for that kind of deliberation. A superintelligence will inevitably ask, “Why should I do what you tell me?” and we had better have an answer in terms that make sense to the AI. But if it asks, “Why are you so confident that your meatbag folk intuitions about ethics are actually right?” that will be hard to answer to anyone’s satisfaction. Still, I don’t know another way forward.