Stuart_Armstrong comments on For the past, in some ways only, we are moral degenerates

Stuart_Armstrong 18 Jun 2019 7:36 UTC
LW: 2 AF: 1

> What scares me is the possibility that moral anti-realism is false, but we build an AI under the assumption that it’s true

One way of dealing with this, in part, is to figure out what would convince you that moral realism was true, and put that in as a strong conditional meta-preference.
- Wei Dai 18 Jun 2019 8:39 UTC
  LW: 6 AF: 2
  I can see two possible ways to convince me that moral realism is true:
  1. I spend hundreds of years or more in a safe environment with a bunch of other philosophically minded people, and we try to come up with arguments for and against moral realism, counterarguments, counter-counterarguments and so on, until we eventually exhaust the space of such arguments and reach a consensus that moral realism is true.
  2. We solve metaphilosophy, program/teach an AI to “do philosophy”, somehow reach high confidence that we did that correctly, and the AI solves metaethics and gives us a convincing argument that moral realism is true.
  Do these seem like things that could be “put in as a strong conditional meta-preference” in your framework?
  - Stuart_Armstrong 19 Jun 2019 14:12 UTC
    LW: 2 AF: 1
    
    > Do these seem like things that could be “put in as a strong conditional meta-preference” in your framework?
    
    Yes, very easily.
    
    The main issue is whether these should count as an overwhelming meta-preference, one that outweighs all other considerations. As I currently have things set up, the answer is no. I have no doubt that you feel strongly about moral realism potentially being true. But I’m certain that this strong feeling is not absurdly strong compared to other preferences at other moments in your life. So if we synthesised your current preferences, and 1. or 2. ended up being true, then moral realism would end up playing a large-but-not-dominating role in your moral preferences.
    
    I wouldn’t want to change that, because what I’m aiming for is an accurate synthesis of your current preferences, and your current preference for moral-realism-if-it’s-true is not, in practice, dominating your preferences. If you wanted to ensure the potential dominance of moral realism, you’d have to put that directly into the synthesis process, as a global meta-preference (section 2.8 of the research agenda).
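    
    As a rough illustration of “large-but-not-dominating” (the additive form and the symbols $u_i$, $w_i$, $w_R$ and $C$ are assumptions made for this sketch, not the synthesis procedure of the research agenda), suppose the synthesised utility is a weighted sum of ordinary partial preferences $u_i$ with positive weights $w_i$, plus a conditional term that switches on only if condition 1. or 2. is met:
    
    $$U_{H} = \sum_{i} w_i \, u_i + w_R \, \mathbb{1}[C] \, U_{R}, \qquad 0 < \frac{w_R}{w_R + \sum_{i} w_i} < 1$$
    
    Here $C$ stands for “1. or 2. ended up being true” and $\mathbb{1}[C]$ is 1 when $C$ holds and 0 otherwise. In this sketch, a large-but-not-dominating role means the fraction on the right is sizeable but stays well below 1, so the $u_i$ terms can still outweigh $U_{R}$ in some decisions. Ensuring dominance would instead require a separate global rule, for example setting $U_{H} = U_{R}$ whenever $C$ holds, which cannot be read off from weights derived from current preferences.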
    
    But the whole discussion feels a bit peculiar to me. One property often assumed of moral realism is that it is, in some sense, ultimately convincing: that all systems of morality (or all systems derived from humans) will converge to it. Yet when I said a “large-but-not-dominating role in your moral preferences”, I was positing that moral realism is true, but that we have a system of morality, $U_{H}$, that does not converge to it. I’m not really grasping how this could be possible (you could argue that the moral realism $U_{R}$ is some sort of acausal trade convergent function, but that gives an instrumental reason to follow $U_{R}$, not an actual reason to have $U_{R}$; and I know that a moral system need not be a utility function ^_^).
    
    So yes, I’m a bit confused by true-but-not-convincing moral realisms.