The normalisation is “human-realistic”, in that the agent is estimating “the best they themselves could do” vs “the worst they themselves could do”.
But this means the normalization depends on how capable the human is, which seems strange, especially in the context of AI. In other words, it doesn’t make sense that an AI would obtain different values from two otherwise identical humans who differ only in how capable they are.
I am entirely convinced that there are no such things.
In a previous post, you didn’t seem this certain about moral anti-realism:
Even if the moral realists are right, and there is a true R, thinking about it is still misleading. Because there is, as yet, no satisfactory definition of this true R, and it’s very hard to make something converge better onto something you haven’t defined. Shifting the focus from the unknown (and maybe unknowable, or maybe even non-existent) R, to the actual P, is important.
Did you move further in the anti-realist direction since then? If so, why?
There are maps from {lists of assumptions + human behaviour + elements of the human internal process} to sets of values, but different assumptions will give different values, and we have no principled way to distinguish between them, except for using our own contradictory and underdefined meta-preferences.
I agree this is the situation today, but I don’t see how we can be so sure that it won’t get better in the future. Philosophical progress is a thing, right?
But this means the normalization depends on how capable the human is, which seems strange, especially in the context of AI.
The min-max normalisation is supposed to measure how much a particular utility function “values” the human moving from being a u-antagonist to a u-maximiser. The full impact of that change is included; so if the human is about to program an AI, the effect is huge. You might see it as the AI asking “utility u: maximise, yes or no?”, where the spread between “yes” and “no” is normalised.
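A minimal sketch of this kind of min-max normalisation, under the simplifying assumption that we can enumerate a finite set of policies and evaluate each utility on them (the policies and utilities below are purely illustrative):

```python
# Min-max normalisation sketch: rescale each utility function so that
# the spread between its best achievable outcome (the u-maximiser) and
# its worst achievable outcome (the u-antagonist) equals 1.

def normalise(u, policies):
    """Rescale u so that max - min over the available policies is 1."""
    best = max(u(p) for p in policies)   # what a u-maximiser achieves
    worst = min(u(p) for p in policies)  # what a u-antagonist achieves
    spread = best - worst
    return lambda p: (u(p) - worst) / spread

# Toy example: two utilities with very different raw scales end up
# on the same 0-to-1 scale after normalisation.
policies = ["a", "b", "c"]
u1 = {"a": 0.0, "b": 5.0, "c": 10.0}.__getitem__
u2 = {"a": -1.0, "b": 0.0, "c": 1.0}.__getitem__

n1 = normalise(u1, policies)
n2 = normalise(u2, policies)
print([n1(p) for p in policies])  # [0.0, 0.5, 1.0]
print([n2(p) for p in policies])  # [0.0, 0.5, 1.0]
```

The point about the AI programmer then corresponds to the policy set being very rich: if one available policy is “program an AI to maximise u”, the best and worst outcomes are far apart, and that entire spread is what gets normalised to 1.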
Did you move further in the anti-realist direction since then? If so, why?
How I describe my position can vary a lot. Essentially I think that there might be a partial order among sets of moral axioms, in that it seems plausible to me that you could say that set A is almost-objectively better than set B (more rigorously: according to criterion c, A>B, and criterion c seems a very strong candidate for an “objectively true” axiom; something comparable to the basic properties of equality https://en.wikipedia.org/wiki/Equality_(mathematics)#Basic_properties).
But it seems clear there is not going to be a total order, nor a maximum element.
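To make the order-theoretic claim concrete, here is a toy illustration of a partial order that has comparable pairs but no total order and no maximum element; the “axiom sets” are just labelled sets compared by strict inclusion, and all the criterion names are hypothetical:

```python
# A partial order with incomparable elements and no maximum:
# "axiom sets" modelled as frozensets of criteria, ordered by
# strict inclusion (x is better than y if it keeps everything
# in y and adds more).

A = frozenset({"consistency", "impartiality"})
B = frozenset({"consistency"})
C = frozenset({"impartiality", "reciprocity"})

def better(x, y):
    """x is almost-objectively better than y: x strictly contains y."""
    return y < x  # strict subset comparison

print(better(A, B))                 # True: A extends B
print(better(A, C), better(C, A))   # False False: A and C are incomparable
```

A and B are comparable, but A and C are not, and nothing in this collection dominates everything else; that is the structural situation being claimed for moral axiom sets.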
I agree this is the situation today, but I don’t see how we can be so sure that it won’t get better in the future. Philosophical progress is a thing, right?
Progress in philosophy involves uncovering true things, not making things easier; mathematics is a close analogue. For example, computational logic would have been a lot simpler if in fact there existed an algorithm that figured out if a given Turing machine would halt. The fact that Turing’s result made everything more complicated didn’t mean that it was wrong.
Similarly, the only reason to expect that philosophy would discover moral realism to be true is if we currently had strong reasons to suppose that moral realism is true.