Evolution normally provides an underlying maximand for organisms: fitness. To maximise their fitnesses, many organisms use their own optimisation tool: a brain—which is essentially a hedonism maximiser. It’s true that sometimes the best thing to do is to use a fast-and-cheap utility function to burn through some task. However, that strategy is normally chosen by a more advanced optimiser for efficiency reasons.
I think a reasonable picture is that creatures act so as to approximate utility maximization as best they can. Utility is an organism’s proxy for its own fitness, and fitness is what really is being maximised.
“Complex reasoning” and “simulating backwards” are legitimate strategies for a utility maximizer to use—if they help to predict how their environment will behave.
Well, being a hedonism maximizer leads to wireheading.
I think a reasonable picture is that creatures act so as to approximate utility maximization as best they can. Utility is an organism’s proxy for its own fitness, and fitness is what really is being maximised.
Nah, whenever a worse/flawed approximation to maximization of hedonistic utility results in better fitness, the organisms do that too.
“Complex reasoning” and “simulating backwards” are legitimate strategies for a utility maximizer to use—if they help to predict how their environment will behave.
They don’t help predict, they help to pick actions out of the space of some >10^1000 actions that the agent really has to choose from. That results in an apparent preference for some actions over others, a preference that has nothing to do with any utilities and everything to do with action generation. Choosing in a huge action space is hard.
The problem really isn’t so much with utility maximization as a model—one could describe anything as utility maximization; in the extreme, define the utility to be 1 whenever an action matches the one the agent chose, and 0 otherwise. The problem is when inexperienced, uneducated people start taking utility maximization too literally, imagine as an ideal some very specific architecture—one that forecasts utilities and compares its forecasts—and start modelling AI behaviour with this idea.
Note, by the way, that such a system is not maximizing a utility, but the system’s prediction of the utility. There is a map-vs-territory confusion here. Maximizing your predicted utility gets you checkmated when the opponent knows how your necessarily inexact prediction works.
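As a rough illustration of the degenerate construction described above, here is a hypothetical Python sketch (the policy and all names are made up, not anyone’s actual proposal): any fixed policy can be dressed up as ‘utility maximization’ by defining the utility to be 1 for whatever the agent would have done anyway.

# Hypothetical sketch: wrap ANY agent policy in a "utility function" that assigns
# 1 to whatever the agent would have done and 0 to everything else. The agent is
# now, vacuously, a "utility maximizer", which is why the description has no
# predictive content on its own.

def blue_avoiding_policy(observation):
    # some arbitrary, hard-coded behaviour
    return "flee" if observation == "blue" else "wander"

def degenerate_utility(observation, action):
    return 1 if action == blue_avoiding_policy(observation) else 0

def as_utility_maximizer(observation, actions=("flee", "wander", "sing")):
    # "maximizing" the degenerate utility just reproduces the original policy
    return max(actions, key=lambda a: degenerate_utility(observation, a))

assert as_utility_maximizer("blue") == blue_avoiding_policy("blue")
assert as_utility_maximizer("red") == blue_avoiding_policy("red")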
Well, being a hedonism maximizer leads to wireheading.
Well, that gets complicated. Of course, we can see that there are non-wireheading hedonism maximizers—so there are ways around this problem.
I think a reasonable picture is that creatures act so as to approximate utility maximization as best they can. Utility is an organism’s proxy for its own fitness, and fitness is what really is being maximised.
Nah, whenever a worse/flawed approximation to maximization of hedonistic utility results in better fitness, the organisms do that too.
Well again this is complicated territory. The evolutionary purpose of the brain is to maximise inclusive fitness—and to do that it models its expectations about its fitness and maximises those. If the environment changes in such a way that the brain’s model is inaccurate, then evolution could hack in non-brain-based solutions—but ultimately it is going to try and fix the problem by getting the brain’s proxy for fitness and evolutionary fitness back into alignment with each other. Usually these two are fairly well aligned—though humans are in a rapidly-changing environment, and so are a bit of an exception to this.
Basically, having organisms fight against their own brains is wasteful—and nature tries to avoid it—by making organisms as harmonious as it can manage.
“Complex reasoning” and “simulating backwards” are legitimate strategies for a utility maximizer to use—if they help to predict how their environment will behave.
They don’t help predict, they help to pick actions out of the space of some >10^1000 actions that the agent really has to choose from.
So, that is the point of predicting! It’s a perfectly conventional way of traversing a search tree to run things backwards and undo—rather than attempt to calculate each prediction from scratch somehow or another. Something like that is not a deviation from a utility maximisation algorithm.
Note, by the way, that such a system is not maximizing a utility, but the system’s prediction of the utility. There is a map-vs-territory confusion here. Maximizing your predicted utility gets you checkmated when the opponent knows how your necessarily inexact prediction works.
Not necessarily—if you expect to face a vastly more powerful agent, you can sometimes fall back on non-deterministic algorithms—and avoid being outwitted in this particular way.
Anyway, you have to maximise your expectations of utility (rather than your utility). That isn’t a map vs territory confusion, it’s just the way agents have to work.
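A toy sketch of the non-deterministic fallback (hypothetical Python; the matching-pennies setup and player names are illustrative assumptions): an exploiter who knows exactly how a deterministic player works beats it every round, but can do no better than chance against a player who randomizes.

import random
random.seed(1)

def deterministic_player(history):
    # fully predictable: repeats its own last move (or opens with "heads")
    return history[-1] if history else "heads"

def randomizing_player(history):
    return random.choice(["heads", "tails"])   # an unexploitable 50/50 mixture

def exploiter(opponent, history):
    # a vastly more powerful agent that knows exactly how the opponent works;
    # it wins by playing the opposite of the opponent's (re-simulated) move
    return "tails" if opponent(history) == "heads" else "heads"

def play(opponent, rounds=1000):
    wins, history = 0, []
    for _ in range(rounds):
        move = opponent(history)
        wins += (move == exploiter(opponent, history))   # opponent wins on a match
        history.append(move)
    return wins / rounds

print(play(deterministic_player))   # 0.0: outwitted every single round
print(play(randomizing_player))     # ~0.5: the realized coin flip can't be predicted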
Well, that gets complicated. Of course, we can see that there are non-wireheading hedonism maximizers—so there are ways around this problem.
Not sure how non-wireheaded, though. One doesn’t literally need a wire to the head to have a shortcut. Watching movies or reading fiction is a pretty wireheaded sort of effort.
Well again this is complicated territory. The evolutionary purpose of the brain is to maximise inclusive fitness—and to do that it models its expectations about its fitness and maximises those.
I’m not quite sure that’s how it works. The pain and the pleasure seem to be the reinforcement values for neural-network training, rather than actual utilities of any kind. Suppose you are training a dog, by reinforcement, not to chew stuff. The reinforcement value is not proportional to the utility of the behaviour, but is set so as to optimize the training process.
So, that is the point of predicting! It’s a perfectly conventional way of traversing a search tree to run things backwards and undo—rather than attempt to calculate each prediction from scratch somehow or another. Something like that is not a deviation from a utility maximisation algorithm.
See, if there are two actions, one resulting in utility 1000 and the other resulting in utility 100, this method can choose the one that results in utility 100, because it is reachable by imperfect backwards tracing while the 1000 one isn’t (and is lost in a giant space that one can’t search). At that point, you could of course declare that being reachable by backwards tracing is a very desirable property of an action, and shift the goalposts so that this action has utility 2000, but it’s clear that this is a screwy approach.
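A toy sketch of that gap (hypothetical Python; the action names, utilities and decomposition table are made up): an exhaustive forward maximizer would find the utility-1000 plan, but an agent that can only generate candidate plans by tracing backwards from goals it already knows how to decompose settles for the utility-100 plan and never even considers the better one.

import itertools

ACTIONS = ["a%d" % i for i in range(10)]   # primitive moves
# A plan here is a sequence of 4 moves: 10**4 plans, standing in for >10**1000.

def utility(plan):
    # stand-in for the true consequences of executing a plan
    if plan == ("a7", "a3", "a9", "a1"):
        return 1000        # the best plan, buried in the huge space
    if plan == ("a0", "a1", "a2", "a3"):
        return 100         # a decent, easily reachable plan
    return 0

def exhaustive_forward_maximizer():
    # the textbook "choose the action with maximum utility"; infeasible at scale
    return max(itertools.product(ACTIONS, repeat=4), key=utility)

# Backward tracing only generates plans reachable by decomposing known goals,
# so it finds the utility-100 plan and never sees the utility-1000 one.
KNOWN_STEPS = {"goal": "a3", "a3": "a2", "a2": "a1", "a1": "a0"}

def backward_tracer():
    plan, step = [], "goal"
    while step in KNOWN_STEPS:
        step = KNOWN_STEPS[step]
        plan.append(step)
    return tuple(reversed(plan))

best = exhaustive_forward_maximizer()
reachable = backward_tracer()
print(best, utility(best))             # ('a7', 'a3', 'a9', 'a1') 1000
print(reachable, utility(reachable))   # ('a0', 'a1', 'a2', 'a3') 100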
Anyway, you have to maximise your expectations of utility (rather than your utility). That isn’t a map vs territory confusion, it’s just the way agents have to work.
And how do you improve the models you use for expectations?
Of course you can describe literally anything as ‘utility maximization’; the issue is that an agent which is maximizing something doesn’t really even need to know what it is maximizing, doesn’t necessarily do any calculation of utilities, et cetera. You don’t really have a model here, you just have a description, and if you model it as a utility maximizer, you’ll be committing the same fallacy as with that blue-minimizing robot.
The reinforcement value is not proportional to the utility of the behaviour, but is set so as to optimize the training process.
Maybe—if the person rewarding the dog is doing it wrong. Normally, you would want those things to keep reasonably in step.
[...] lost in a giant space that one can’t search [...]
So, in the “forecasting/evaluation/tree pruning” framework, that sounds as though it is a consequence of tree pruning.
Pruning is inevitable in resource-limited agents. I wouldn’t say it stopped them from being expected utility maximisers, though.
how do you improve the models you use for expectations?
a) get more data;
b) figure out how to compress it better (a toy sketch follows below);
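A toy sketch of the “compress it better” criterion (hypothetical Python; the biased-coin data and the two-part code lengths are illustrative assumptions): score each candidate model by the bits needed to describe the model plus the bits needed to describe the data given the model, and prefer the better compressor.

import math, random
random.seed(0)

# Hypothetical sketch: two-part code lengths for two candidate models of the
# same binary data. The model that compresses the data into fewer total bits
# is the better model of the source.

data = [1 if random.random() < 0.8 else 0 for _ in range(1000)]   # a biased source
n, ones = len(data), sum(data)

def fair_coin_codelength():
    return 0.0 + n * 1.0                # no parameters to encode; 1 bit per symbol

def biased_coin_codelength():
    p = ones / n                        # fitted parameter
    h = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return 0.5 * math.log2(n) + n * h   # parameter cost + data cost

for name, bits in [("fair coin", fair_coin_codelength()),
                   ("biased coin", biased_coin_codelength())]:
    print(name, round(bits, 1), "bits")
# The biased-coin model compresses the data better, so it is the better model.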
You don’t really have a model here, you just have a description, and if you model it as a utility maximizer, you’ll be committing the same fallacy as with that blue-minimizing robot.
Alas, I don’t like that post very much. It is an attack on the concept of utility-maximization, which hardly seems productive to me. Anyway, I think I see your point here—though I am less clear about how it relates to the previous conversation (about expectations of utility vs utility—or more generally about utility maximisation being somehow “sub-optimal”).
Maybe—if the person rewarding the dog is doing it wrong. Normally, you would want those things to keep reasonably in step.
If the dog chews something really expensive up, there is no point punishing the dog proportionally more for that. That would be wrong; some level of punishment is optimal for training; beyond this is just letting anger out.
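A toy sketch of that distinction (hypothetical Python; the behaviours, costs and clipping range are made up): the training signal is clipped to whatever range keeps the learning update well-behaved, rather than being made proportional to the true cost of the behaviour, much as reward clipping is used in deep reinforcement learning practice.

# Hypothetical sketch: the signal fed to the learner is deliberately NOT
# proportional to the true damage done; it is clipped to a range chosen to
# optimize training.

TRUE_COST = {"chew_old_slipper": -5, "chew_heirloom_rug": -5000}

def training_signal(behaviour):
    # punishment level chosen for stable learning, not to mirror the loss
    return max(-1.0, min(1.0, TRUE_COST[behaviour] / 10.0))

value = {"chew_old_slipper": 0.0, "chew_heirloom_rug": 0.0}
LEARNING_RATE = 0.5

for _ in range(20):
    for behaviour in value:
        r = training_signal(behaviour)
        value[behaviour] += LEARNING_RATE * (r - value[behaviour])

print(value)
# The learned aversions converge to about -0.5 and -1.0: bounded and similar,
# even though the true costs differ a thousandfold.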
Pruning is inevitable in resource-limited agents. I wouldn’t say it stopped them from being expected utility maximisers, though.
It’s not mere pruning. You need a person to be able to feed your pets, you need them to get through the door, they need a key, you can get a key made at a key-duplicating place, so you go to the key-duplicating place you know of to make a duplicate.
That stops them from being usefully modelled as ‘choose the action that gives maximum utility’. You can’t assume that it takes the action that results in maximum utility. You can say that it takes the action which results in as much utility as this agent, with its limitations, could get out of the situation, but that’s almost tautological at this point. Also, see http://en.wikipedia.org/wiki/Intelligent_agent for terminology.
Anyway, I think I see your point here—though I am less clear about how it relates to the previous conversation (about expectations of utility vs utility—or more generally about utility maximisation being somehow “sub-optimal”).
Well, the utility agent as per the wiki article is clearly stupid, because it won’t reason backwards. And the utility maximizers discussed by purely theoretical AI researchers, likewise.
a) get more data; b) figure out how to compress it better;
Needs something better than trying all models and seeing what fits, though. One should ideally be able to use the normal reasoning to improve models. It feels as though a better model has higher utility.
Maybe—if the person rewarding the dog is doing it wrong. Normally, you would want those things to keep reasonably in step.
If the dog chews something really expensive up, there is no point punishing the dog proportionally more for that. That would be wrong; some level of punishment is optimal for training; beyond this is just letting anger out.
You would probably want to let the dog know that some of your chewable things are really expensive. You might also want to tell it about the variance in the value of your chewable items. I’m sure there are some cases where the owner might want to manipulate the dog by giving it misleading reward signals—but honest signals are often best.
Well, the utility agent as per the wiki article is clearly stupid, because it won’t reason backwards. And the utility maximizers discussed by purely theoretical AI researchers, likewise.
These are the researchers who presume no computational resource limitation? They have no need to use optimisation heuristics—such as the ones you are proposing—since they assume unlimited computing resources.
a) get more data; b) figure out how to compress it better;
Needs something better than trying all models and seeing what fits, though. One should ideally be able to use the normal reasoning to improve models. [...]
Sure. Humans use “normal reasoning” to improve their world models.
You would probably want to let the dog know that some of your chewable things are really expensive. You might also want to tell it about the variance in the value of your chewable items. I’m sure there are some cases where the owner might want to manipulate the dog by giving it misleading reward signals—but honest signals are often best.
Well, I don’t think that quite works; dogs aren’t terribly clever. Back to humans: e.g. significant injuries hurt a lot less than you’d think they would; my guess is that small self inflicted ones hurt so much for effective conditioning.
These are the researchers who presume no computational resource limitation? They have no need to use optimisation heuristics—such as the ones you are proposing—since they assume unlimited computing resources.
The ugly part is when some go on to talk about AGIs certainly killing everyone unless designed in some way that isn’t going to work, and otherwise paint wrong pictures of AGI.
Sure. Humans use “normal reasoning” to improve their world models.
Yes. Sometimes even resulting in breakage, when they modify their world models to fit some pre-existing guess.
The idea and its explanation both seem pretty speculative to me.
Pavlovian conditioning is settled science; the pain being a negative utility value for the intelligence, etc., not so much.
The “idea” was:
significant injuries hurt a lot less than you’d think they would
...and its explanation was:
my guess is that small self inflicted ones hurt so much for effective conditioning.
I’m inclined towards scepticism: significant injuries often hurt a considerable amount, and small ones do not hurt by disproportionately large amounts, at least as far as I know.
There do seem to be some ceiling-like effects—to try and prevent people passing out and generally going wrong. I don’t think that has to do with your hypothesis.
The very fact that you can pass out from pain, and that pain otherwise interferes with thought and actions, implies that pain doesn’t work remotely like a utility should. Of course one does factor pain into the utility, but that is potentially dangerous for survival (you may, e.g., have to cut your hand off when it’s stuck under a boulder and you have already determined that cutting the hand off is the best means of survival). You can expect interference along the lines of passing out from the network-training process. You can’t expect such interference from utility values being calculated.
edit: Okay, for the middle ground: would you agree that pain has a Pavlovian conditioning role? The brain also assigns it negative utility, but the pain itself isn’t utility; it evolved long before brains could think very well. And in principle you’d be better off assigning utility to lasting damage rather than to pain (and most people do at least try).
edit: that is to say, removing your own appendix would have to be easy (for surgeons, at least) if pain were just a utility, properly summed with other utilities, leaving you, through the entire process, overall happy that you have the knife for the procedure and can save yourself. It’d be like giving up an item worth $10 for $10,000,000: the values are properly summed first, so you don’t feel the loss and the gain separately.
The very fact that you can pass out from pain, and that pain otherwise interferes with thought and actions, implies that pain doesn’t work remotely like a utility should.
You don’t think consciousness should be sacrificed—no matter what the degree of damage—in an intelligently designed machine? Nature sacrifices consciousness under a variety of circumstances. Can you defend your intuition about this issue? Why is nature wrong to permit fainting and passing out from excessive pain?
Of course pain should really hurt. It is supposed to distract you and encourage you to deal with it. Creatures in which pain didn’t really, really hurt are likely to have left fewer descendants.
Well, insofar as the intelligence is not distracted and can opt to sit still and play dead, there doesn’t seem to be a point in fainting. Any time I have a somewhat notable injury (falling off a bike, ripping my chin, getting a nasty case of road rash), the pain is less than the pain of minor injuries.
Contrast the anticipation of pain with actual pain. Those feel very different. Maybe it is fair to say that the pain is instrumental in creating the anticipation of pain, which acts more like a utility for an intelligent agent. Pain also serves as a warning signal, for conditioning, and generally as something that stops you from eating yourself (and perhaps for telling the intelligence what is and isn’t your body). The pain is supposed to encourage you to deal with the damage, but not to distract you from dealing with the damage.
Well, insofar as the intelligence is not distracted and can opt to sit still and play dead, there doesn’t seem to be a point in fainting.
I don’t pretend to know exactly why nature does it—but I expect there’s a reason. It may be that sometimes being conscious is actively bad. This is one of the reasons for administering anaesthetics—there are cases where a conscious individual in a lot of pain will ineffectually flail around and get themselves into worse trouble, where they would be better off being quiet and still, “playing dead”.
As to why not “play dead” while remaining conscious—that’s a bit like having two “off” switches. There’s already an off switch. Building a second one that bypasses all the usual responses of the conscious mind while remaining conscious could be expensive. Perhaps not ideal for a rarely-used feature.
A lot of the time something is just a side effect. E.g. you select for less aggressive foxes, and you end up with foxes with floppy ears and white spots on their fur.
With regard to flailing around, that strikes me as more of a reflex than utility-driven behaviour. As for playing dead: I mean, I can sit still while having teeth done without anaesthesia.
The problem with just fainting is that it is a reflex—under some conditions, faint; under other conditions, flail around—not proper utility-maximizing agent behaviour: what are the consequences of flailing around, what are the consequences of sitting still, choose the one that has the better consequences.
It seems to me that originally the pain was there to train the neural network not to eat yourself; then it got re-used for other stuff that it is not very suitable for.
The problem with just fainting is that it is a reflex—under some conditions, faint; under other conditions, flail around—not proper utility-maximizing agent behaviour: what are the consequences of flailing around, what are the consequences of sitting still, choose the one that has the better consequences.
Well, it’s a consequence of resource limitation. A supercomputer with moment-by-moment control over actions might never faint. However, when there’s a limited behavioural repertoire, with less precise control over what action to take, and a limited space in which to associate sensory stimuli and actions, occasionally fainting could become a more reasonable course of action.
It seems to me that originally the pain was there to train the neural network not to eat yourself; then it got re-used for other stuff that it is not very suitable for.
The pleasure-pain axis is basically much the same idea as a utility value—or perhaps the first derivative of a utility. The signal might be modulated by other systems a little, but that’s the essential nature of it.
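One way to make the “first derivative of a utility” reading concrete (a hypothetical Python sketch; the utility trajectory is made up): treat the momentary pleasure/pain signal as the change in an underlying utility level from one moment to the next, so that summing the signal recovers the utility curve up to a constant.

# Hypothetical sketch of "pleasure-pain as the first derivative of utility":
# the felt signal at each step is the change in an underlying utility level.

utility_trajectory = [10, 10, 4, 4, 6, 9, 9]   # e.g. an injury at t=2, then recovery

def hedonic_signal(utilities):
    return [b - a for a, b in zip(utilities, utilities[1:])]

signal = hedonic_signal(utility_trajectory)
print(signal)                                  # [0, -6, 0, 2, 3, 0]

# Summing the signal recovers the utility trajectory (up to the starting level):
recovered = [utility_trajectory[0]]
for s in signal:
    recovered.append(recovered[-1] + s)
assert recovered == utility_trajectory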
Then why does anticipated pain feel so different from actual ongoing pain?
Also, I think it’s more of a consequence of resource limitations of a worm or a fish. We don’t have such severe limitations.
Other issue: consider 10 hours of harmless but intense pain vs a perfectly painless lobotomy. I think most of us would try harder to avoid the latter than the former, and would prefer the pain. edit: furthermore, we could consciously and wilfully take a painkiller, but not a lobotomy-fear-neutralizer.
Then why does anticipated pain feel so different from actual ongoing pain?
I’m not sure I understand why they should be similar. Anticipated pain may never happen. Combining anticipated pain with actual pain probably doesn’t happen more because that would “muddy” the reward signal. You want a “clear” reward signal to facilitate attributing the reward to the actions that led to it. Too much “smearing” of reward signals out over time doesn’t help with that.
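A toy sketch of the “smearing” point (hypothetical Python; the decay constant, action list and reward schedules are made up): with credit assigned through an exponentially decaying trace, a sharp reward right after the responsible action concentrates the blame on it, while the same total reward spread over later steps blames unrelated actions almost as much.

# Hypothetical sketch: credit for a reward is assigned to past actions with an
# exponentially decaying trace (more recent actions get more of the credit).

DECAY = 0.5
actions = ["wander", "wander", "touch_stove", "wander", "wander", "wander"]

def assign_credit(rewards):
    credit = [0.0] * len(actions)
    for t, r in enumerate(rewards):
        for k in range(t + 1):                 # actions at or before time t
            credit[k] += r * DECAY ** (t - k)
    return [round(c, 3) for c in credit]

sharp   = [0, 0, -1.0, 0, 0, 0]                # pain right at the bad action
smeared = [0, 0, -0.25, -0.25, -0.25, -0.25]   # same total pain, spread out

print(assign_credit(sharp))     # blame is concentrated on "touch_stove"
print(assign_credit(smeared))   # later "wander" actions get blamed nearly as much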
I think it’s more of a consequence of resource limitations of a worm or a fish. We don’t have such severe limitations.
Maybe—though that’s probably not an easy hypothesis to test.
Well, the utility, as in the ‘utility function’ of a utility-maximizing agent, is something that’s calculated on a predicted future state. The pain is only calculated in the now. That’s a subtle distinction.
I think this lobotomy example (provided that the subject knows what a lobotomy is and what the brain is, and thus doesn’t want a lobotomy) clarifies why I don’t think pain is working quite like a utility function’s output. The fear does work like a proper utility function’s output. When you fear something, you also don’t want to get rid of that fear (with some exceptions in people who basically don’t fear correctly). And fear is all about a future state.
Well, the utility, as in the ‘utility function’ of a utility-maximizing agent, is something that’s calculated on a predicted future state. The pain is only calculated in the now. That’s a subtle distinction.
It’s better to think of future utility as an extrapolation of current utility—and current utility as basically the same thing as the position of the pleasure-pain axis. Otherwise there is a danger of pointlessly duplicating concepts.
Distinguishing too much between utility and pleasure is the cause of lots of problems. The pleasure-pain axis is nature’s attempt to engineer a utility-based system. It did a pretty good job.
Of course, you should not take “pain” too literally—in the case of humans. Humans have modulations on pain that feed into their decision circuitry—but the result is still eventually collapsed down into one dimension—like a utility value.
It’s better to think of future utility as an extrapolation of current utility—and current utility as basically the same thing as the position of the pleasure-pain axis. Otherwise there is a danger of pointlessly duplicating concepts.
The danger here is in inventing terminology that is at odds with normally-used terminology, resulting in confusion when reading texts written in the standard terminology. I would rather describe human behaviour in terms of a ‘learning agent’ as per Russell & Norvig (2003), where the pain is part of the ‘critic’. You can see a diagram on Wikipedia: http://en.wikipedia.org/wiki/Intelligent_agent
Ultimately, the overly broad definitions become useless.
We also have a bit of the ‘reflex agent’ in us, where pain makes you flinch away or flail around or faint (though I’d dare a guess that most people don’t faint even when pain has saturated and can’t increase any further).
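A minimal sketch of the “learning agent” framing, loosely after the Russell & Norvig decomposition into critic, learning element and performance element (hypothetical Python; the class names, numbers and toy world are illustrative assumptions, not the book’s code): pain feeds the critic as a training signal, with a small reflex path alongside the learned behaviour.

class Critic:
    def feedback(self, percept):
        # turns raw pain/pleasure in the percept into a learning signal;
        # a training signal, not a utility over predicted future world-states
        return -1.0 if percept["pain"] else 0.1

class PerformanceElement:
    def __init__(self):
        self.value = {"touch": 0.0, "avoid": 0.0}
    def choose(self, percept):
        if percept["pain"] > 0.9:              # reflex path (not exercised below)
            return "avoid"
        return max(self.value, key=self.value.get)

class LearningElement:
    def update(self, performance, action, signal):
        performance.value[action] += 0.3 * (signal - performance.value[action])

def run(steps=30):
    critic, perf, learner = Critic(), PerformanceElement(), LearningElement()
    for _ in range(steps):
        action = perf.choose({"pain": 0.0})
        pain = 1.0 if action == "touch" else 0.0   # the toy world punishes touching
        learner.update(perf, action, critic.feedback({"pain": pain}))
    return perf.value

print(run())   # the "touch" value is driven down; "avoid" becomes preferred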