Is there a proof that it’s possible to prove Friendliness?
No. There’s also no proof that it’s possible to prove that P!=NP, and for the Friendliness problem it’s much, much less clear what the problem even means. You aren’t entitled to that particular proof; it’s not expected to be available until it’s no longer needed. (Many difficult problems get solved, or almost solved, without a proof of their solvability appearing in the interim.)
Why is it plausible that Friendliness is provable? Or is it more that the problem is so important that it’s worth trying regardless?
There is no clearly defined or motivated problem of “proving Friendliness”. We need to understand what goals are, what humane goals are, what process can be used to access their formal definition, and what kinds of things can be done with them, how, and to what end. We need to understand these things well, which (on a psychological level) triggers an association with mathematical proofs, and will probably actually involve some mathematics suited to the task. Whether the answers take the form of something describable as “provable Friendliness” seems to me an unclear/unmotivated consideration. Unpacking that label might make it possible to provide a more useful response to the question.
I wonder what SI would do next if they could prove that Friendly AI was not possible; for example, if it could be shown that value drift is inevitable and that utility functions are unstable under recursive self-improvement.
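To make the value-drift hypothetical a bit more concrete, here is a minimal toy sketch (my own illustration, not anything from SI or the commenter): it assumes, purely for the sake of example, that each round of self-modification amounts to copying a small vector of utility weights with a tiny random error, and it only shows that such errors accumulate across many rewrites rather than cancelling out. It says nothing about whether real self-improving systems would actually transmit their goals this way.

```python
import random

random.seed(0)

# Hypothetical utility-function weights for a toy agent (illustrative numbers).
original = [1.0, 0.5, -0.25]
weights = list(original)

# Model each round of "self-improvement" as the agent rewriting itself and
# copying its utility weights with a small random relative error (+/- 0.1%).
for step in range(1000):
    weights = [w * (1 + random.uniform(-0.001, 0.001)) for w in weights]

# The per-step errors are tiny, but they accumulate instead of cancelling exactly.
drift = max(abs(w - o) for w, o in zip(weights, original))
print(f"largest weight drift after 1000 rewrites: {drift:.4f}")
```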
That doesn’t seem like the only circumstance in which FAI is not possible. If moral nihilism is true, then FAI is impossible even if value drift is not inevitable.
In that circumstance, shouldn’t we try to make any AI we decide to build “friendly” to present-day humanity, even if it wouldn’t be friendly to Aristotle or Plato or Confucius? Based on the hidden-complexity-of-wishes analysis, consistency with our current norms is still plenty hard.
My concerns are more that it will not be possible to adequately define “human”, especially as transhuman tech develops, and that there might not be a good enough way to define what’s good for people.
As I understand it, the modest goal of building an FAI is that of giving an AGI a push in the “right” direction, what EY refers to as the initial dynamics. After that, all bets are off.