We don’t know how to program a foolproof method of “filling in the gaps” (and a lot of “filling in the gaps” would be a creative process rather than a mere learning one, such as figuring out how to extend natural language concepts to new areas).
Inasmuch as that is relying on the word “foolproof”, it is proving much too much, since we barely have foolproof methods to do anything.
The thing is that your case needs to be argued from consistent and fair premises, where “fair” means that your opponents are allowed to use them.
If you are assuming that an AI has sufficiently advanced linguistic abilities to talk its way out of a box, then your opponents are entitled to assume that the same level of ability could be applied to understanding verbally specified goals.
If you are assuming that it is a limitation of ability that is preventing the AI from understanding what “chocolate” means, then your opponents are entitled to assume it is weak enough to be boxable.
And it helps if people speak about this problem in terms of coding, rather than high-level concepts, because all the specific examples people have ever come up with for coding learning have had these kinds of flaws.
What specific examples? Loosemore’s counterargument is in terms of coding. And I notice you don’t avoid NL arguments yourself.
Coding learning with some imperfections might be OK if the AI is motivated merely to learn, but it is positively pernicious if the AI has other motivations as to what to do with that learning (see my post here for a way of getting around it: https://agentfoundations.org/item?id=947 )
I rather doubt that the combination of a learning goal, plus some other goal, plus imperfect ability is all that deadly, since we already have AIs that are like that, and they haven’t killed us. I think you must be making some other assumptions, for instance that the AI is in some sort of “God” role, with an open-ended remit to improve human life.
If you are assuming that an AI has sufficiently advanced linguistic abilities to talk its way out of a box, then your opponents are entitled to assume that the same level of ability could be applied to understanding verbally specified goals.
They are entitled to assume they could be applied, not necessarily that they would be. At some point, there’s going to have to be something that tells the AI to, in effect, “use the knowledge and definitions in your knowledge base to honestly do X [X = some NL objective]”. This gap may be easy to bridge, or hard; no-one’s suggested any way of bridging it so far.
It might be possible; it might be trivial. But there’s no evidence in that direction so far, and the designs that people have actually proposed have been disastrous. I’ll work at bridging this gap, and see if I can solve it to some level of approximation.
And I notice you don’t avoid NL arguments yourself.
Yes, which is why I’m stepping away from those arguments to help bring clarity.
They are entitled to assume they could be applied, not necessarily that they would be. At some point, there’s going to have to be something that tells the AI to, in effect, “use the knowledge and definitions in your knowledge base to honestly do X [X = some NL objective]”. This gap may be easy to bridge, or hard; no-one’s suggested any way of bridging it so far.
There’s only a gap if you start from the assumption that a compartmentalised UF is in some way easy, natural or preferable. However, your side of the debate has never shown that.
At some point, there’s going to have to be something that tells the AI to, in effect, “use the knowledge and definitions in your knowledge base to honestly do X [X = some NL objective]”.
No...you don’t have to show a fan how to make a whirring sound… use of updatable knowledge to specify goals is a natural consequence of some designs.
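A minimal sketch of what that might look like (a hypothetical toy design, not any real proposed system): if the goal is expressed as a reference into the agent’s updatable knowledge base, then refining a concept automatically refines what the goal means, with no separate step needed to tell the AI to “use its knowledge base”.

```python
# Hypothetical sketch: a goal expressed as a pointer into an updatable
# knowledge base, so its meaning tracks the agent's current understanding.

class KnowledgeBase:
    """Concept definitions the agent refines as it learns."""
    def __init__(self):
        # Initial, crude definition of "chocolate".
        self.concepts = {"chocolate": lambda item: item == "cocoa_bar"}

    def refine(self, name, new_definition):
        # Learning updates the definition in place.
        self.concepts[name] = new_definition

class ReferentialGoal:
    """A goal whose extension is looked up at evaluation time."""
    def __init__(self, kb, concept_name):
        self.kb = kb
        self.concept_name = concept_name

    def satisfied_by(self, item):
        return self.kb.concepts[self.concept_name](item)

kb = KnowledgeBase()
goal = ReferentialGoal(kb, "chocolate")

print(goal.satisfied_by("truffle"))   # False: concept not yet refined
kb.refine("chocolate", lambda item: item in {"cocoa_bar", "truffle"})
print(goal.satisfied_by("truffle"))   # True: the goal tracked the concept
```

In this design there is no “whirring sound” to program in separately: evaluating the goal just is a lookup into the current knowledge base.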
It might be possible; it might be trivial.
You are assuming it is difficult, with little evidence.
But there’s no evidence in that direction so far, and the designs that people have actually proposed have been disastrous.
Designs that bridge a gap, or designs that intrinsically don’t have one?
I’ll work at bridging this gap, and see if I can solve it to some level of approximation.
Why not examine the assumption that there has to be a gap?
There’s only a gap if you start from the assumption that a compartmentalised UF is in some way easy, natural or preferable.
? Of course there’s a gap. The AI doesn’t start with full NL understanding. So we have to write the AI’s goals before the AI understands what the symbols mean.
Even if the AI started with full NL understanding, we still would have to somehow program it to follow our NL instructions. And we can’t do that initial programming using NL, of course.
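The claimed gap can be illustrated with a toy contrast (purely hypothetical, for illustration): a goal whose extension is copied out at coding time stays frozen at the programmer’s initial, pre-NL understanding, whereas one that dereferences the concept at evaluation time does not.

```python
# Hypothetical illustration of the "gap": a goal frozen at coding time
# does not benefit from later learning, unlike one that looks up the
# current concept definition when evaluated.

concepts = {"chocolate": {"cocoa_bar"}}  # crude initial definition

# Compartmentalised goal: the extension is copied (frozen) at write time.
frozen_extension = set(concepts["chocolate"])
def frozen_goal(item):
    return item in frozen_extension

# Referential goal: dereferences the concept at evaluation time.
def referential_goal(item):
    return item in concepts["chocolate"]

# Later learning refines the concept...
concepts["chocolate"].add("truffle")

print(frozen_goal("truffle"))       # False: stuck with the coding-time meaning
print(referential_goal("truffle"))  # True: tracks the updated concept
```

Which of these two pictures is the natural one for a given architecture is exactly what is in dispute above.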
Of course there’s a gap. The AI doesn’t start with full NL understanding.
Since you are talking in terms of a general counterargument, I don’t think you can appeal to a specific architecture.
So we have to write the AI’s goals before the AI understands what the symbols mean.
Which would be a problem if it is designed to attempt to execute NL instructions without checking whether it understands them...which is a bit clown-car-ish. An AI that is capable of learning NL as it goes along is an AI that has a general goal of getting language right. Why assume it would not care about one specific sentence?
Even if the AI started with full NL understanding, we still would have to somehow program it to follow our NL instructions
Y-e-es? Why assume “it needs to follow instructions” equates to “it would simplify the instructions it’s following” rather than something else?