Some comments on the recruiting plan:

1. I think a highly rational person would have high moral uncertainty at this point and not necessarily be described as “altruistic”. For example, I consider Eliezer’s apparent high certainty in utilitarianism (assuming it’s not just a front for PR purposes) to be evidence against his rationality. Given a choice between a more altruistic candidate and a more rational candidate, I think SI ought to choose the latter.
2. Similarly for “deeply committed to AI risk reduction”. I think a highly rational person would think that working on AI risk reduction is probably the best thing to do at this point but would be pretty uncertain about this and be ready to change their mind if new evidence or theories come along.
3. What does “trustworthy” mean, apart from “rationality”? Something like psychological stability?
4. It seems like the plan is to have one Eliezer-type (philosophy-oriented) person on the team, with the rest being math-focused. I don’t understand why it isn’t more like half and half, or aiming for a balance of skills in all recruits. If there is only one philosophy-oriented person on the team, how will the others catch his mistakes? If the reason is that you don’t expect to be able to recruit more than one Eliezer-type (of sufficient skill), then I think that’s enough reason to not build an FAI team.
5. I think a highly desirable trait in an FAI team member is having a strong suspicion that flaws lurk in every idea. This seems to work better at motivating one to try to find flaws than just “having something to protect” does.
Regarding 5, I would think an important subskill would be recognizing arbitrariness in conceptual distinctions, e.g. between belief and preference, agent and environment, computation and context, ethics and meta-ethics, et cetera. Relatedly, not taking existing conceptual frameworks and their distinctions as the word of God. The word of von Neumann is a lot like the word of God, but still not quite.
By the way I love comments like yours here that emphasize moral uncertainty.
I think a highly rational person would have high moral uncertainty at this point and not necessarily be described as “altruistic”.
Do you think the correct level of moral uncertainty would place so much probability on egoism-like hypotheses that the behavior it outputs, even after taking into account various game-theoretical concerns about cooperation as well as the surprisingly large apparent asymmetry between the size of altruistic returns available vs. the size of egoistic returns available, wouldn’t be substantially more altruistic than the way a typical human or a typical math genius is likely to behave? It seems implausible to me, but I’m not that confident, and as I’ve been saying earlier, the topic is weirdly neglected here for one with such high import.
Given a choice between a more altruistic candidate and a more rational candidate, I think SI ought to choose the latter.
Surely it depends on how much more altruistic and how much more rational.
various game-theoretical concerns about cooperation
Most people have some pre-theoretic intuitions about cooperation, which game theory may merely formalize. It’s not clear to me that familiarity with such theoretical concerns implies one ought to be more “altruistic” than average.
the surprisingly large apparent asymmetry between the size of altruistic returns available vs. the size of egoistic returns available
If someone is altruistic because they’ve maxed out their own egoistic values (or have gotten to severely diminishing returns), I certainly wouldn’t count that against their rationality. But if “egoistic returns” include abstract values that the rest of humanity doesn’t necessarily share, then the “large apparent asymmetry” is unclear to me.
as I’ve been saying earlier, the topic is weirdly neglected here for one with such high import
Where did you say that? (I wrote “Shut Up and Divide?”, which may or may not be relevant depending on what you mean by “the topic”.)
Surely it depends on how much more altruistic and how much more rational.
Why “surely”, given that I’m not a random member of humanity, and may have more values in common with a less altruistic candidate than with a more altruistic one?
If someone is altruistic because they’ve maxed out their own egoistic values (or have gotten to severely diminishing returns), I certainly wouldn’t count that against their rationality. But if “egoistic returns” include abstract values that the rest of humanity doesn’t necessarily share, then the “large apparent asymmetry” is unclear to me.
I just meant that it seems possible to improve a lot of other people’s expected quality of life at the expense of relatively small decreases to one’s own (though people are generally not doing so). This seems like it should cause the outcome of a process with moral uncertainty between egoism and altruism to skew more toward the altruist side in some sense, though I don’t understand how to deal with moral uncertainty (if anyone else does, I’d be interested in your answers). If by “abstract values” you mean something like making the universe as simple as possible by setting all the bits to zero, then I agree there’s no asymmetry, but I wouldn’t call that “egoistic” as such.
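To make the claimed skew concrete, here is a minimal sketch (not from the discussion above) of one naive approach to moral uncertainty, maximizing credence-weighted value, which assumes the egoist and altruist value scales can be compared directly; that comparability assumption, and the payoff numbers, are invented purely for illustration.

```python
# Minimal illustrative sketch: credence-weighted value ("expected
# choiceworthiness") under moral uncertainty between egoism and altruism.
# The payoffs, and the assumption that the two value scales are directly
# comparable, are made up for illustration only.

def expected_choiceworthiness(p_altruism, egoistic_value, altruistic_value):
    """Credence-weighted value of an action under the two moral hypotheses."""
    return (1 - p_altruism) * egoistic_value + p_altruism * altruistic_value

# Hypothetical actions reflecting the claimed asymmetry: giving resources
# away costs a little egoistically but purchases a lot altruistically.
actions = {
    "keep resources": {"ego": 1.0, "alt": 0.0},
    "give resources": {"ego": -0.1, "alt": 100.0},
}

p_altruism = 0.2  # even a modest credence in altruism-like hypotheses
for name, values in actions.items():
    score = expected_choiceworthiness(p_altruism, values["ego"], values["alt"])
    print(f"{name}: {score:.2f}")
# keep resources: 0.80
# give resources: 19.92
```

Under this (questionable) aggregation rule, a large enough asymmetry in available returns dominates even a modest credence in altruism, which is one way of cashing out the skew described above; whether such intertheoretic comparisons are legitimate at all is part of the open problem the comment points to.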
Where did you say that? (I wrote “Shut Up and Divide?”, which may or may not be relevant depending on what you mean by “the topic”.)
Here. Yes, SUAD was a good and relevant contribution.
Why “surely”, given that I’m not a random member of humanity, and may have more values in common with a less altruistic candidate than with a more altruistic one?
You’re right that it’s not certain that altruism in an FAI team candidate is, all else equal, more desirable. I guess I’m just saying that if it is, then sufficiently large differences in altruism outweigh sufficiently small differences in rationality.

I have written a few more posts that are relevant to the “egoism vs altruism” question:

http://lesswrong.com/lw/8gk/where_do_selfish_values_come_from/
http://lesswrong.com/lw/6ta/what_if_sympathy_depends_on_anthropomorphizing/
http://lesswrong.com/lw/2b7/hacking_the_cev_for_fun_and_profit/
http://lesswrong.com/lw/1mo/the_preference_utilitarians_time_inconsistency/
I guess we don’t have more discussions of altruism vs egoism because making progress on the problem is hard. Typical debates about moral philosophy are not very productive, and it’s probably fortunate that LW is good at avoiding them.
Do you agree? Do you think there are good arguments to be had that we’re not having for some reason? Does it seem to you that most LWers are just not very interested in the problem?
It seems like the plan is to have one Eliezer-type (philosophy-oriented) person on the team, with the rest being math-focused. I don’t understand why it isn’t more like half and half, or aiming for a balance of skills in all recruits. If there is only one philosophy-oriented person on the team, how will the others catch his mistakes? If the reason is that you don’t expect to be able to recruit more than one Eliezer-type (of sufficient skill), then I think that’s enough reason to not build an FAI team.
From what I understand from past utterances, core SingInst folks tend to extend their “elite math” obsession to very nearly equating it with capability for philosophy.

Can you give some examples of such utterances?
One somewhat close quote that popped to mind (from lukeprog’s article on philosophy):
Second, if you want to contribute to cutting-edge problems, even ones that seem philosophical, it’s far more productive to study math and science than it is to study philosophy. You’ll learn more in math and science, and your learning will be of a higher quality.
My view is that if you take someone with philosophical talents and interests (presumably inherited or caused by the environment in a hard-to-control manner), you can make a better philosopher out of them by having them study more math and science than a typical philosopher’s education includes. But if you take someone with little philosophical talent and interest and do the same, they’ll just become a mathematician or scientist.
I think this is probably similar to the views of SIAI people, and your quote doesn’t contradict my understanding.
Do you have ideas about how to find philosophical talent, especially the kind relevant for Friendliness philosophy? I don’t think SingInst folk have worked very thoroughly on the problem, but someone might have. Geoff Anders has spent a lot of time thinking about the problem and he runs summer programs teaching philosophy. Dunno how much progress he’s made. (Um, for whatever it’s worth, he seems to think I have philosophical aptitude—modus ponens or modus tollens, take your pick.)
Unfortunately, none of the core SingInst guys seem to have any interesting accomplishments in math or to have actually studied that math in depth; it is a very insightful remark by Luke, but it would be great if they had applied it to themselves; otherwise it just looks like the Dunning-Kruger effect. I don’t see any reason to think that the elite-math references are anything but lame signaling of the kind usually done by those who don’t know math well enough to signal the knowledge properly (by actually doing something new in math). Sadly, it works: if you use jargon and say something like what Luke said, then some of the people who can’t independently evaluate your math skills will assume they must be very high. Meanwhile, I will assume them to be rather low, because those with genuinely high skill will signal it in a different way.
The most recent example is hard to give; it was in person, from Anna. For other examples I would have to search through Eliezer’s comments from years back.
I think “trustworthy” here means something along the lines of “committed to the organization/project”, in the sense that they’re not going to take the ideas/code used in SI conversations and ventures to Google or some other project. In other words, they’re not going to be bribed away.
Thanks for this. I’m writing a followup to this post that incorporates the points you’ve raised here.