Suppose you have the intuition that extended reflection and coherence are good heuristics to guide your extrapolation. I, on the other hand, think that extended reflection as a base human is dangerous, and coherence has nothing to do with what’s right. I’d rather that the extrapolated me experiment with self-modification after only a moderate amount of theorizing, and at the end merge with its counter-factual versions through acausal negotiation.
Suppose further that you end up in control of FAI design, and you want it to take my morality into account. Would you have it extrapolate me using your preferred method, or mine?
> Suppose you have the intuition that extended reflection and coherence are good heuristics to guide your extrapolation. I, on the other hand, think that extended reflection as a base human is dangerous, and coherence has nothing to do with what’s right.
These heuristics concern ways of using more resources. The resources themselves are heuristically assumed to be useful, so we discuss how best to use them.
(Now to shift to an object-level argument for a change.)
Notice the “especially if you’re given opportunity to decide how the process is to be set up” in my comment. I agree that unnaturally extended reflection is dangerous; we might even run into physiological problems with computation in brains that are too chronologically old. But 50 years is better than 6 months, even if both are dangerous. And if you actually work on planning these reflection sessions, you can set up groups of humans to work for a limited time, then reset them, having them pass only their writings on to new humans, and filter those findings through not-older-than-50 humans trained on progressively improved findings, and so on. For most objections you could raise about why it’s dangerous, we could work on finding a solution to that problem. For any experiment with FAI design, we would be better off thinking about it first.
Likewise, suppose you task 1000 groups of humans with coming up with possible strategies for using the next batch of computational resources (not with doing the most good explicitly, but with developing an even better heuristic understanding of the problem), and you model human research groups as carrying a risk of falling into reflective death spirals, where every member of a group succumbs to a memetic infection that yields no answer to the question they were considering. Then it seems like a good heuristic to place considerably less weight on suggestions that come up very rarely and aren’t supported by some additional vetting process.
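To make the heuristic concrete, here is a purely illustrative toy sketch (the function name, threshold, and example suggestions are all made up): aggregate each group's proposed strategies, and keep only those that enough groups converged on independently, or that passed some separate vetting step.

```python
# Toy sketch of the frequency-based vetting heuristic (illustrative only).
from collections import Counter

def filter_suggestions(group_outputs, min_support=0.5, vetted=frozenset()):
    """group_outputs: one set of suggestions per research group.
    Keep a suggestion if a large enough fraction of groups proposed it
    independently, or if it passed an additional vetting process."""
    n = len(group_outputs)
    counts = Counter(s for output in group_outputs for s in set(output))
    return {s for s, c in counts.items()
            if c / n >= min_support or s in vetted}

# Example: 3 of 4 groups converge on one strategy; a lone group's
# outlier suggestion (possibly a memetic death spiral) is down-weighted.
groups = [{"train rationality first"},
          {"train rationality first", "self-modify immediately"},
          {"train rationality first"},
          {"voting schemes"}]
kept = filter_suggestions(groups)
# kept == {"train rationality first"}
```

The point of the sketch is only that rarity plus lack of independent vetting is evidence against a suggestion, not proof; the `vetted` escape hatch exists because a rare suggestion can still be right.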
For example, the first batches of research could focus on developing effective training programs in rationality, then in social engineering, voting schemes, and so on. The overall architecture of a future human-level meta-ethics necessary for more dramatic self-improvement (or for improving the methods of getting things done, such as using a non-human AI or a science of deep non-human moral calculations) would come much later.
In short, I’m not proposing anything that opposes the strategies you named, so you’d need to point to incurable problems that make the strategy of thinking more about the problem lead to worse results than randomly making stuff up (sorry!).
> and at the end merge with its counter-factual versions through acausal negotiation.
The current understanding of acausal control (which is a consideration from decision theory, which can in turn be seen as a normative element of a meta-ethics, which is the same kind of consideration as “let’s reflect more”) is inadequate to place the weight of the future on a statement like this. We need to think more about decision theory in particular before making such decisions.
> Suppose further that you end up in control of FAI design
What does that mean? Being able to order a computer around doesn’t tell me what to do with it.
> and you want it to take my morality into account. Would you have it extrapolate me using your preferred method, or mine?
I’d think about the problem more, or try to implement a reliable process for doing that, if I can.