Wei Dai comments on The Urgent Meta-Ethics of Friendly Artificial Intelligence

Wei Dai 2 Feb 2011 3:50 UTC
17 points

how is it possible that Eliezer’s “right” doesn’t designate anything

Eliezer identifies “right” with “the ideal morality that I would have if I heard all the arguments, to whatever extent such an extrapolation is coherent.” It is possible that human morality, when extrapolated, shows no coherence, in which case Eliezer’s “right” doesn’t designate anything.

how could you arrive at such a strong conclusion based on his non-technical writings, since he could just mean something different, or could have insufficient precision in his own idea to determine this property

Are you saying that Eliezer’s general approach might still turn out to be correct, if we substitute better definitions or understandings of “extrapolation” and/or “coherence”? If so, I agree, and I didn’t mean to exclude this possibility with my original statement. Should I have made it clearer when I said “I think Eliezer’s meta-ethics is wrong” that I meant “based on my understanding of Eliezer’s current ideas”?
- Vladimir_Nesov 2 Feb 2011 4:28 UTC
  4 points
  Parent
  
  It is possible that human morality, when extrapolated, shows no coherence
  
  For example, I have no idea what this means. I don’t know what “extrapolated” means, apart from some vague intuitions, and even what “coherent” means.
  
  Are you saying that Eliezer’s general approach might still turn out to be correct, if we substitute better definitions or understandings of “extrapolation” and/or “coherence”?
  
  Better than what? I have no specific adequate candidates, only a direction of research.
  - Wei Dai 2 Feb 2011 20:52 UTC
    3 points
    Parent
    
    It is possible that human morality, when extrapolated, shows no coherence
    
    For example, I have no idea what this means.
    
    Did you read the thread I linked to in my opening comment, where Marcello and I argued in more detail why we think that? Perhaps we can move the discussion there, so you can point out where you disagree with or don’t understand us?
    - Vladimir_Nesov 2 Feb 2011 21:38 UTC
      4 points
      Parent
      To respond to that particular argument (though I don’t see how it substantiates the point that morality according to Eliezer’s meta-ethics could be void):
      
      When you’re considering what a human mind would conclude upon considering certain new arguments, you’re thinking of ways to improve it. A natural heuristic is to add opportunity for reflection, but obviously exposing a human mind to “unbalanced” arguments can lead it anywhere. So you suggest a heuristic of looking for areas of “coherence” in conclusions reached upon exploration of different ways of reflecting.
      
      But this “coherence” is also merely a heuristic. What you want is to improve the mind in the right way, not in a coherent way, or a balanced way. So you let the mind reflect on strategies for exposing itself to more reflection, and then on strategies for reflecting on reflecting on strategies for getting more reflection, and so on, in any way deemed appropriate by the current implementation. There’s probably no escaping this unguided stage, for the most right guide available is the agent itself (unfortunately).
      
      What you end up with won’t have an opportunity to “regret” past mistakes, for every regret is a recognition of an error, and any error can be corrected (for the most part). What’s wrong with “incoherent” future growth? Does lack of coherence indicate a particular error, something not done right? If it does, that could be corrected. If it doesn’t, everything is fine.
      
      (By the way, this argument could potentially place advanced human rationality and human understanding of decision theory and meta-ethics directly on track to a FAI, with the only way of making a FAI using a human (upload) group self-improvement.)
      - Wei Dai 2 Feb 2011 22:56 UTC
        5 points
        Parent
        I believe that in Eliezer’s meta-ethics, both the extrapolation procedure and the coherence property are to be given fixed logical definitions as part of the meta-ethics, and are not just “heuristics” to be freely chosen by the subject being extrapolated. You seem to be describing your own ideas, which are perhaps similar enough to Eliezer’s to be said to fall under his general approach, but which I don’t think can be said to be Eliezer’s meta-ethics.
        
        making a FAI using a human (upload) group self-improvement
        
        Seems like a reasonable idea, but again, almost surely not what Eliezer intended.
        Vladimir_Nesov 2 Feb 2011 23:03 UTC
        1 point
        Parent
        
        I believe that in Eliezer’s meta-ethics, both the extrapolation procedure and the coherence property are to be given fixed logical definitions as part of the meta-ethics, and are not just “heuristics” to be freely chosen by the subject being extrapolated.
        
        Why “part of meta-ethics”? That would make sense as part of FAI design. Surely the details are not to be chosen “freely”, but still there’s only one criterion for anything, and that’s full morality. For any fixed logical definition, any element of any design, there’s a question of what could improve it, make the consequences better.
        Wei Dai 2 Feb 2011 23:10 UTC
        4 points
        Parent
        
        Why “part of meta-ethics”?
        
        I think because Eliezer wanted to ensure a good chance that right_Eliezer and right_random_human turn out to be very similar. If you let each person choose how to extrapolate using their own current ideas, you’re almost certainly going to end up with very different extrapolated moralities.
        Vladimir_Nesov 2 Feb 2011 23:28 UTC
        2 points
        Parent
        The point is not that they’ll be different, but that mistakes will be made, making the result not quite right, or more likely not right at all. So at the early stage, one must be very careful and develop a reliable theory of how to proceed, instead of just doing stuff at random, or rather according to current human heuristics.
        
        An extended amount of reflection looks like one of the least invasive self-improvement techniques, something that’s expected to make you more reliably right, especially if you’re given opportunity to decide how the process is to be set up. This could get us to the next stage, and so on. More invasive heuristics can prove too disruptive, wrong in unexpected and poorly understood ways, so that one won’t be able to expect the right outcome without close oversight from a moral judgment, which we don’t yet have in any technically strong enough form.
        Wei Dai 3 Feb 2011 20:04 UTC
        6 points
        Parent
        Suppose you have the intuition that extended reflection and coherence are good heuristics to guide your extrapolation. I, on the other hand, think that extended reflection as a base human is dangerous, and coherence has nothing to do with what’s right. I’d rather that the extrapolated me experiment with self-modification after only a moderate amount of theorizing, and at the end merge with its counter-factual versions through acausal negotiation.
        
        Suppose further that you end up in control of FAI design, and you want it to take my morality into account. Would you have it extrapolate me using your preferred method, or mine?
        Vladimir_Nesov 4 Feb 2011 14:15 UTC
        4 points
        Parent
        
        Suppose you have the intuition that extended reflection and coherence are good heuristics to guide your extrapolation. I, on the other hand, think that extended reflection as a base human is dangerous, and coherence has nothing to do with what’s right.
        
        What these heuristics discuss are ways of using more resources. The resources themselves are heuristically assumed to be useful, and so we discuss how to use them best.
        
        (Now to slip to an object-level argument for a change.)
        
        Notice the “especially if you’re given opportunity to decide how the process is to be set up” in my comment. I agree that unnaturally extended reflection is dangerous; we might even run into physiological problems with computations in brains that are too chronologically old. But 50 years is better than 6 months, even if both 50 years and 6 months are dangerous. And you could actually work on planning these reflection sessions: set up groups of humans to work for some time, then maybe reset them and have them pass only their writings on to new humans, filter such findings using not-older-than-50 humans trained on progressively improved findings, and so on. For most reasons you could raise for why it’s dangerous, we could work on finding a solution to that problem. For any experiment with FAI design, we would be better off thinking about it first.
        
        Likewise, if you task 1000 groups of humans to work on coming up with possible strategies for using the next batch of computational resources (not for doing most good explicitly, but for developing even better heuristic understanding of the problem), and you use the model of human research groups as having a risk of falling into reflective death spirals where all members of a group can fall to memetic infection that gives no answers to the question they considered, then it seems like a good heuristic to place considerably less weight on suggestions that come up very rarely and don’t get supported by some additional vetting process.
        
        For example, the first batches of research could focus on developing effective training programs in rationality, then in social engineering, voting schemes, and so on. The overall architecture of future human-level meta-ethics necessary for more dramatic self-improvement (or improvement in the methods of getting things done, such as using a non-human AI or a science of deep non-human moral calculations) would come much later.
        
        In short, I’m not talking about anything that opposes the strategies you named, so you’d need to point to incurable problems that make the strategy of thinking more about the problem lead to worse results than randomly making stuff up (sorry!).
        
        and at the end merge with its counter-factual versions through acausal negotiation.
        
        The current understanding of acausal control (which is a consideration from decision theory, which can in turn be seen as a normative element of a meta-ethics, which is the same kind of consideration as “let’s reflect more”) is inadequate to place the weight of the future on a statement like this. We need to think more about decision theory, in particular, before making such decisions.
        
        Suppose further that you end up in control of FAI design
        
        What does it mean? If I can order a computer around, that doesn’t allow me to know what to do with it.
        
        and you want it to take my morality into account. Would you have it extrapolate me using your preferred method, or mine?
        
        I’d think about the problem more, or try implementing a reliable process for that if I can.
  - Blueberry 2 Feb 2011 5:00 UTC
    0 points
    Parent
    
    For example, I have no idea what this means. I don’t know what “extrapolated” means, apart from some vague intuitions, and even what “coherent” means.
    
    It means, for instance, that segments of the population who have different ideas on controversial moral questions like abortion or capital punishment actually have different moralities and different sets of values, and that we as a species will never agree on what answers are right, regardless of how much debate or discussion or additional information we have. I strongly believe this to be true.
    - Vladimir_Nesov 2 Feb 2011 10:48 UTC
      1 point
      Parent
      Clearly, I know all this stuff, so I meant something else. Like not having a more precise understanding (which could also easily collapse this surface philosophizing).
      - Blueberry 2 Feb 2011 19:21 UTC
        1 point
        Parent
        Well, yes, I know you know all this stuff. Are you saying we can’t meaningfully discuss it unless we have a precise algorithmic definition of CEV? People’s desires and values are not that precise. I suspect we can only discuss it in vague terms until we come up with some sort of iterative procedure that fits our intuition of what CEV should be, at which point we’ll have to operationally define CEV as that procedure.