Nope. Specifying goal systems is FAI work, not AI work.
So then simpler utility functions will be easier to code and easier to prove correct.
Relative to ancient Greece, building a .45 caliber semiautomatic pistol isn’t much harder than building a .22 caliber semiautomatic pistol. You might think the weaker weapon would be less work, but most of the problem doesn’t scale all that much with the weapon strength.
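(To make the scaling claim concrete, here is a toy sketch, written in Python purely for this illustration and not drawn from anyone in the thread: the utility function is a one-line plug-in, while the world model and planner account for nearly all of the code, so swapping in a "simpler" utility barely shrinks the job.)

```python
# Toy expected-utility agent, invented for this illustration.
# The utility function is one line; the model and planner are everything else,
# and they stay the same no matter how "simple" the utility is.
import itertools

def transition(state, action):
    """Toy stochastic world model: the action usually works, sometimes does nothing."""
    return {state + action: 0.8, state: 0.2}

def expected_utility(state, action_seq, utility):
    """Propagate a distribution over states through the model, then score it."""
    dist = {state: 1.0}
    for action in action_seq:
        new_dist = {}
        for s, p in dist.items():
            for s2, p2 in transition(s, action).items():
                new_dist[s2] = new_dist.get(s2, 0.0) + p * p2
        dist = new_dist
    return sum(p * utility(s) for s, p in dist.items())

def plan(state, utility, horizon=3, actions=(-1, 1)):
    """Brute-force planner: also unchanged by the choice of utility."""
    return max(itertools.product(actions, repeat=horizon),
               key=lambda seq: expected_utility(state, seq, utility))

# Two utility functions of very different "complexity"; the machinery around
# them is identical either way.
simple_utility = lambda s: s                       # just maximize the number
fussier_utility = lambda s: -abs(s - 7) + (s % 2)  # a more arbitrary target

print(plan(0, simple_utility))
print(plan(0, fussier_utility))
```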
OK, so you’re saying that FAI is not hard because you have to formalize human morality, it’s hard because you have to have a system for formalizing things in general?
I’m tempted to ask why you’re so confident on this subject, but this debate probably isn’t worth having: once you’re at the point where you can formalize things, the relative difficulty of formalizing different utility functions will presumably be obvious.
OK, so you’re saying that FAI is not hard because you have to formalize human morality, it’s hard because you have to have a system for formalizing things in general?
Pretty much. Thanks for compactifying. “Rigorously communicating” might be a better term than “formalizing”; “formalizing” has been tainted by academics showing off.
OK, so you’re saying that FAI is not hard because you have to formalize human morality, it’s hard because you have to have a system for formalizing things in general?
This also seems to be the only way out. If human values are too complex to reimplement manually (which seems to be the case), you have to create a tool with the capability to do that automatically. And once you have that tool, cutting corners on the content of human values would just be useless: the tool will work on the whole thing. And you can’t cut corners on the tool itself, just as you can’t have a computer with only a randomly sampled 50% of its circuitry.
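(As a minimal sketch of what the simplest version of such a tool could look like, the example below fits a utility function from pairwise human preference judgments instead of hand-coding it. It is a Bradley-Terry-style logistic fit; the features, judgments, and weights are all invented for the illustration, and the rest of the thread is about why the real problem is much harder than this.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: each outcome is a feature vector, and each judgment
# records that a human preferred outcome a over outcome b.
outcomes = rng.normal(size=(50, 4))            # 50 outcomes, 4 features each
true_w = np.array([1.0, -2.0, 0.5, 0.0])       # "real" values, unknown to the tool
pairs = rng.integers(0, 50, size=(200, 2))     # which outcomes were compared
prefs = (outcomes[pairs[:, 0]] @ true_w >
         outcomes[pairs[:, 1]] @ true_w).astype(float)

def fit_utility(outcomes, pairs, prefs, steps=500, lr=0.1):
    """Logistic (Bradley-Terry-style) fit: P(a preferred to b) = sigmoid(u(a) - u(b))."""
    w = np.zeros(outcomes.shape[1])
    for _ in range(steps):
        diff = outcomes[pairs[:, 0]] - outcomes[pairs[:, 1]]  # feature gap per judged pair
        p = 1.0 / (1.0 + np.exp(-diff @ w))                   # predicted preference probability
        w += lr * diff.T @ (prefs - p) / len(prefs)           # gradient ascent on log-likelihood
    return w

learned_w = fit_utility(outcomes, pairs, prefs)
print("learned utility weights:", np.round(learned_w, 2))
```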
If human values are too complex to reimplement manually (which seems to be the case), you have to create a tool with the capability to do that automatically. And once you have that tool
You’re right, of course, but the point at hand is what to do before you have that tool.
Work towards developing it?
If human values are too complex to reimplement manually (which seems to be the case), you have to create a tool with the capability to do that automatically.
You can’t prove it works before running it in that case. Human values are not some kind of fractal pattern, where something complicated can be generated according to simple rules. In your proposal, the AI would have to learn human values somehow, which means it will have one indicator or another that it’s getting closer to human values (e.g. smiling humans), which will then be susceptible to wire-heading. Having the AI make inferences from a large corpus of human writing might work.
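(To put the indicator worry in miniature, here is a deliberately cartoonish sketch with hypothetical names and numbers: the agent can only see the proxy signal, so the action that inflates the proxy without producing any of the underlying value wins the comparison.)

```python
def smile_indicator(world):
    """Proxy signal the agent can observe: count of detected smiles."""
    return world["genuine_smiles"] + world["faked_smiles"]

def true_human_value(world):
    """What was actually wanted; not visible to the agent."""
    return world["genuine_smiles"]

# Hypothetical outcomes of two candidate actions.
actions = {
    "help_people":           {"genuine_smiles": 10, "faked_smiles": 0},
    "tile_world_with_masks": {"genuine_smiles": 0,  "faked_smiles": 10**6},
}

chosen = max(actions, key=lambda a: smile_indicator(actions[a]))
print("agent picks:", chosen)                              # the wire-headed option
print("proxy score:", smile_indicator(actions[chosen]))    # enormous
print("actual value:", true_human_value(actions[chosen]))  # zero
```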