A lot of the time something is just a side effect. E.g. you select for less aggressive foxes and you end up with foxes with floppy ears and white spots on their fur.
With regard to flailing around, that strikes me as more of a reflex than utility-driven behaviour. As for playing dead: I mean, I can sit still while having teeth done without anaesthesia.
The problem with just fainting is that it is a reflex (under some conditions, faint; under other conditions, flail around), not proper utility-maximizing agent behaviour: what are the consequences of flailing around, what are the consequences of sitting still, then choose the one that has better consequences.
It seems to me that originally pain was there to train the neural network not to eat yourself, and then it got re-used for other stuff that it is not very suitable for.
> The problem with just fainting is that it is a reflex (under some conditions, faint; under other conditions, flail around), not proper utility-maximizing agent behaviour: what are the consequences of flailing around, what are the consequences of sitting still, then choose the one that has better consequences.

Well, it’s a consequence of resource limitation. A supercomputer with moment-by-moment control over actions might never faint. However, when there’s a limited behavioural repertoire, with less precise control over what action to take, and a limited space in which to associate sensory stimuli and actions, occasionally fainting could become a more reasonable course of action.
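To make the reflex/utility distinction concrete, here is a toy Python sketch (the condition names, outcome model, and utility numbers are all invented for illustration): a reflex agent maps a condition straight to an action, while a utility-maximizing agent predicts the outcome of each available action and picks the one whose consequences score best.

```python
# Toy contrast between a reflex agent and a utility-maximizing agent.
# All conditions, actions, outcomes, and utility numbers are invented.

REFLEX_RULES = {
    "sudden_sharp_pain": "flail",
    "overwhelming_pain": "faint",
}

def reflex_agent(condition):
    """Condition -> action lookup; no consideration of consequences."""
    return REFLEX_RULES.get(condition, "do_nothing")

def utility_agent(state, actions, predict_outcome, utility):
    """Pick the action whose predicted outcome has the highest utility."""
    return max(actions, key=lambda a: utility(predict_outcome(state, a)))

# Reflex: the same stimulus always produces the same response.
print(reflex_agent("sudden_sharp_pain"))  # -> flail

# Utility-maximizing: compare the consequences of sitting still vs flailing
# while having a tooth drilled (made-up outcome model and utilities).
outcomes = {
    ("tooth_drilled", "sit_still"): "procedure_finishes",
    ("tooth_drilled", "flail"): "drill_slips",
}
utilities = {"procedure_finishes": 1.0, "drill_slips": -10.0}
print(utility_agent(
    "tooth_drilled",
    ["sit_still", "flail"],
    predict_outcome=lambda s, a: outcomes[(s, a)],
    utility=utilities.get,
))  # -> sit_still
```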
> It seems to me that originally pain was there to train the neural network not to eat yourself, and then it got re-used for other stuff that it is not very suitable for.

The pleasure-pain axis is basically much the same idea as a utility value—or perhaps the first derivative of a utility. The signal might be modulated by other systems a little, but that’s the essential nature of it.
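One way to read the “first derivative” remark is that the felt signal at each moment tracks the change in some underlying utility rather than its level. A minimal sketch, with a made-up bodily-condition score standing in for that utility:

```python
# Toy reading of "pleasure-pain as the first derivative of utility": the felt
# signal at each step is the change in an underlying utility (here a made-up
# bodily-condition score), not the utility level itself.

def hedonic_signal(utility_trace):
    """Differences between successive utility levels."""
    return [later - earlier for earlier, later in zip(utility_trace, utility_trace[1:])]

body_condition = [10.0, 9.0, 7.0, 7.0, 8.5]   # injury, then partial recovery
print(hedonic_signal(body_condition))          # -> [-1.0, -2.0, 0.0, 1.5]
```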
Then why does anticipated pain feel so different from actual ongoing pain?
Also, I think it’s more a consequence of the resource limitations of a worm or a fish. We don’t have such severe limitations.
Another issue: consider 10 hours of harmless but intense pain vs a perfectly painless lobotomy. I think most of us would try harder to avoid the latter than the former, and would prefer the pain. Edit: furthermore, we could consciously and wilfully take a painkiller, but not a lobotomy-fear-neutralizer.
> Then why does anticipated pain feel so different from actual ongoing pain?

I’m not sure I understand why they should be similar. Anticipated pain may never happen. Combining anticipated pain with actual pain probably doesn’t happen because that would “muddy” the reward signal. You want a “clear” reward signal, to make it easier to attribute the reward to the actions that led to it. Smearing reward signals out over time too much doesn’t help with that.
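As a toy illustration of the credit-assignment point: the sketch below (all actions, timing, and reward sizes invented) has a naive learner that credits whichever action immediately preceded each bit of reward. When the reward arrives at the moment of the causal action it is attributed correctly; when the same total reward is smeared over the following steps, most of it lands on unrelated actions.

```python
import random

# Toy illustration of why a temporally "sharp" reward makes credit assignment
# easier for a naive learner that credits whichever action immediately
# preceded each bit of reward. The actions, timing, and reward sizes are
# entirely made up.

random.seed(0)

def run_episode(smear):
    """One episode: 'eat_food' at step 5 is the only action that earns reward.
    If smear > 0, the same total reward is spread over this and the
    following smear - 1 steps."""
    credit = {"eat_food": 0.0, "scratch": 0.0, "walk": 0.0}
    pending = []  # (steps_until_delivery, amount) of reward not yet delivered
    for t in range(20):
        action = "eat_food" if t == 5 else random.choice(["scratch", "walk"])
        if action == "eat_food":
            if smear:
                pending.extend((k, 1.0 / smear) for k in range(smear))
            else:
                pending.append((0, 1.0))
        # Deliver whatever reward is due now and credit the current action.
        due = sum(amount for delay, amount in pending if delay == 0)
        pending = [(delay - 1, amount) for delay, amount in pending if delay > 0]
        credit[action] += due
    return credit

print("sharp reward:  ", run_episode(smear=0))
print("smeared reward:", run_episode(smear=8))
```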
> I think it’s more a consequence of the resource limitations of a worm or a fish. We don’t have such severe limitations.

Maybe—though that’s probably not an easy hypothesis to test.
Well, utility, as in the ‘utility function’ of a utility-maximizing agent, is something that is calculated over predicted future states. Pain is only calculated in the now. That’s a subtle distinction.
I think this lobotomy example (provided that the subject knows what a lobotomy is and what the brain is, and thus doesn’t want a lobotomy) clarifies why I don’t think pain works quite like a utility function’s output. Fear does work like a proper utility function’s output: when you fear something, you also don’t want to get rid of that fear (with some exceptions among people who basically don’t fear correctly). And fear is all about future state.
> Well, utility, as in the ‘utility function’ of a utility-maximizing agent, is something that is calculated over predicted future states. Pain is only calculated in the now. That’s a subtle distinction.

It’s better to think of future utility as an extrapolation of current utility, and of current utility as basically the same thing as the position on the pleasure-pain axis. Otherwise there is a danger of pointlessly duplicating concepts.
Distinguishing too sharply between utility and pleasure is the cause of lots of problems. The pleasure-pain axis is nature’s attempt to engineer a utility-based system. It did a pretty good job.
Of course, you should not take “pain” too literally in the case of humans. Humans have modulations on pain that feed into their decision circuitry, but the result is still eventually collapsed down into one dimension, like a utility value.
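The “future utility extrapolates current utility” framing can be put in standard reinforcement-learning terms: treat the momentary pleasure-pain signal as the immediate reward, and the utility of a course of action as the discounted sum of the rewards it is predicted to produce. A minimal sketch, with invented reward sequences:

```python
# Sketch of "future utility as an extrapolation of current utility" in
# reinforcement-learning terms: the momentary pleasure-pain signal plays the
# role of the immediate reward r_t, and a course of action is valued by the
# discounted sum of the rewards it is predicted to yield. The reward
# sequences below are invented for illustration.

def discounted_return(rewards, gamma=0.9):
    """Sum of gamma**t * r_t over a predicted reward sequence."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# E.g. enduring brief pain now for a lasting benefit vs avoiding the pain:
endure_dentist = [-5.0] + [0.5] * 30   # painful now, mild relief afterwards
avoid_dentist = [0.0] + [-0.3] * 30    # painless now, nagging toothache later

print(discounted_return(endure_dentist))  # higher of the two, so "endure" wins
print(discounted_return(avoid_dentist))
```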
> It’s better to think of future utility as an extrapolation of current utility, and of current utility as basically the same thing as the position on the pleasure-pain axis. Otherwise there is a danger of pointlessly duplicating concepts.

The danger here is in inventing terminology that is at odds with normally used terminology, resulting in confusion when reading texts written in standard terminology. I would rather describe human behaviour as a ‘learning agent’, as per Russell & Norvig (2003), where pain is part of the ‘critic’. You can see a diagram on Wikipedia: http://en.wikipedia.org/wiki/Intelligent_agent
Ultimately, overly broad definitions become useless.
We also have a bit of ‘reflex agent’, where pain makes you flinch away or flail around or faint (though I’d dare a guess that most people don’t faint even when pain has saturated and can’t increase any further).
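For concreteness, here is a rough Python sketch of that division of labour, loosely following the learning-agent decomposition mentioned above (critic, learning element, performance element) with a hard-wired reflex shortcut bolted on. The class, thresholds, and state names are invented for illustration and are not taken from Russell & Norvig.

```python
# Rough sketch of a learning agent in the critic / learning element /
# performance element style, plus a reflex shortcut. Thresholds, state
# names, and the update rule are invented for illustration only.

class LearningAgent:
    def __init__(self):
        self.action_values = {}   # the performance element's learned preferences
        self.learning_rate = 0.1

    def critic(self, percept):
        """Turn raw percepts into a scalar feedback signal; here pain is
        the (negative) feedback."""
        return -percept.get("pain", 0.0)

    def reflex(self, percept):
        """Hard-wired condition-action shortcut that bypasses learning."""
        if percept.get("pain", 0.0) > 9.0:
            return "flinch"
        return None

    def performance_element(self, state):
        """Choose the action currently believed best for this state."""
        options = self.action_values.get(state, {})
        return max(options, key=options.get) if options else "explore"

    def learning_element(self, state, action, feedback):
        """Nudge the performance element toward actions the critic liked."""
        values = self.action_values.setdefault(state, {})
        old = values.get(action, 0.0)
        values[action] = old + self.learning_rate * (feedback - old)

    def step(self, state, percept, last_action=None):
        if last_action is not None:
            self.learning_element(state, last_action, self.critic(percept))
        return self.reflex(percept) or self.performance_element(state)

agent = LearningAgent()
print(agent.step("hand_on_stove", {"pain": 10.0}))  # -> flinch (reflex wins)
print(agent.step("hand_on_stove", {"pain": 0.0}))   # -> explore (nothing learned yet)
```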