the practical FAI project would nonetheless choose the AI’s goal system in the old-fashioned way, by human deliberation and consensus
[...]
Our practical FAI project has “solved” FAI by simply coming to an agreement on what to wish for, and by studying with legalistic care how to avoid pitfalls and loopholes in the finer details of the wish
If we are making an AGI, then humans think too slowly by comparison to consider every possible aspect of a “wish”, so I don’t think legalistic care is strong enough, given the large negative utility of a mistake. A mathematical proof of Friendliness should be required, and that is what the formalisations of “hopelessly impractical models of cognition” (e.g. TDT) are a step towards.
If your goal is AGI, then you want a cognitive architecture that will exhibit these behaviors.
FTFY. If you are designing something to have behaviour x, then you want behaviour x to definitely occur, possibly built out of other behaviours, but not just “emerging” from them.
I think the problem with the proposal is the opposite of what I think you think it is.
Omohundro’s universal AI instrumental values are things that, if absent in the final product, mean that you have failed. Their presence means little because one could simply design for them.
It’s not that we want these behaviors to occur; if we don’t know how they do occur, then “emerging” or “arising in a way I do not understand” are fine phrases to use. If you don’t understand how they arise from the sub-units you’ve carefully built, you’re probably (but not certainly) in a lot of trouble. And if you try too hard to design the unit to exhibit these behaviors directly, you’re hacking together a solution and are almost certainly failing; “certainly”, less a “so you’re saying there’s a chance” sliver of probability.
That’s what I was trying to say, thanks :)