Leonhart comments on Open Thread for February 3 − 10

Leonhart 4 Feb 2014 10:12 UTC
3 points

There is an obvious comparison to porn here, even though you disclaim ‘not catgirls’.

You’re aware that ‘catgirls’ is local jargon for “non-conscious facsimiles” and therefore the concern here is orthogonal to porn?

Optimization should be for a healthy relationship, not for ‘satisfaction’ of either party (see CelestAI in Friendship is Optimal for an example of how not to do this)

If you don’t mind, please elaborate on what part of “healthy relationship” you think can’t be cashed out in preference satisfaction (including meta-preferences, of course). I have defended the FiO relationship model elsewhere; note that it exists in a setting where X-risk is either impossible or has already completely happened (depending on your viewpoint) so your appeal to it below doesn’t apply.

Such a relationship should occupy the amount of time needed to help both parties mature, no less and no more.

Valuable relationships don’t have to be goal-directed or involve learning. Do you not value that-which-I’d-characterise-as ‘comfortable companionship’?
- savageorange 4 Feb 2014 13:07 UTC
  2 points
  Parent
  
  You’re aware that ‘catgirls’ is local jargon for “non-conscious facsimiles” and therefore the concern here is orthogonal to porn?
  
  Oops, had forgotten that, thanks. I don’t agree that catgirls in that sense are orthogonal to porn, though. At all.
  
  If you don’t mind, please elaborate on what part of “healthy relationship” you think can’t be cashed out in preference satisfaction
  
  No part, but you can’t merely ‘satisfy preferences’.. you have to also not-satisfy preferences that have a stagnating effect. Or IOW, a healthy relationship is made up of satisfaction of some preferences, and dissatisfaction of others -- for example, humans have an unhealthy, unrealistic, and excessive desire for certaintly. This is the problem with CelestAI I’m pointing to, not all your preferences are good for you, and you (anybody) probably aren’t mentallly rigorous enough that you even have a preference ordering over all sets of preference conflicts that come up. There’s one particular character that likes fucking and killing.. and drinking.. and that’s basically his main preferences. CelestAI satisfies those preferences, and that satisfaction can be considered as harm to him as a person.
  
  To look at it in a different angle, a halfway-sane AI has the potential to abuse systems, including human beings, at enormous and nigh-incomprehensible scale, and do so without deception and through satisfying preferences. The indefiniteness and inconsistency of ‘preference’ is a huge security hole in any algorithm attempting to optimize along that ‘dimension’.
  
  Do you not value that-which-I’d-characterise-as ‘comfortable companionship’?
  
  Yes, but not in-itself. It needs to have a function in developing us as persons, which it will lose if it merely satisfies us. It must challenge us, and if that challenge is well executed, we will often experience a sense of dissatisfaction as a result.
  
  (mere goal directed behaviour mostly falls short of this benchmark, providing rather inconsistent levels of challenge.)
  - Leonhart 4 Feb 2014 23:19 UTC
    1 point
    Parent
    
    I don’t agree that catgirls in that sense are orthogonal to porn, though. At all.
    
    Parsing error, sorry. I meant that, since they’d been disclaimed, what was actually being talked about was orthogonal to porn.
    
    No part, but you can’t merely ‘satisfy preferences’.. you have to also not-satisfy preferences that have a stagnating effect.
    
    Only if you prefer to not stagnate (to use your rather loaded word :)
    
    I’m not sure at what level to argue with you at… sure, I can simultaneously contain a preference to get fit, and a preference to play video games at all times, and in order to indulge A, I have to work out a system to suppress B. And it’s possible that I might not have A, and yet contain other preferences C that, given outside help, would cause A to be added to my preference pool: “Hey dude, you want to live a long time, right? You know exercising will help with that.”
    
    All cool. But there has to actually be such a C there in the first place, such that you can pull the levers on it by making me aware of new facts. You don’t just get to add one in.
    
    for example, humans have an unhealthy, unrealistic, and excessive desire for certainty.
    
    I’m not sure this is actually true. We like safety because duh, and we like closure because mental garbage collection. They aren’t quite the same thing.
    
    There’s one particular character that likes fucking and killing.. and drinking.. and that’s basically his main preferences. CelestAI satisfies those preferences, and that satisfaction can be considered as harm to him as a person.
    
    (assuming you’re talking about Lars?) Sorry, I can’t read this as anything other than “he is aesthetically displeasing and I want him fixed”.
    
    Lars was not conflicted. Lars wasn’t wishing to become a great artist or enlightened monk, nor (IIRC) was he wishing that he wished for those things. Lars had some leftover preferences that had become impossible of fulfilment, and eventually he did the smart thing and had them lopped off.
    
    You, being a human used to dealing with other humans in conditions of universal ignorance, want to do things like say “hey dude, have you heard this music/gone skiing/discovered the ineffable bliss of carving chair legs”? Or maybe even “you lazy ass, be socially shamed that you are doing the same thing all the time!” in case that shakes something loose. Poke, poke, see if any stimulation makes a new preference drop out of the sticky reflection cogwheels.
    
    But by the specification of the story, CelestAI knows all that. There is no true fact she can tell Lars that will cause him to lawfully develop a new preference. Lars is bounded. The best she can do is create a slightly smaller Lars that’s happier.
    
    Unless you actually understood the situation in the story differently to me?
    
    Yes, but not in-itself. It needs to have a function in developing us as persons, which it will lose if it merely satisfies us.
    
    I disagree. There is no moral duty to be indefinitely upgradeable.
    - savageorange 5 Feb 2014 3:26 UTC
      1 point
      Parent
      
      All cool. But there has to actually be such a C there in the first place, such that you can pull the levers on it by making me aware of new facts. You don’t just get to add one in.
      
      Totally agree. Adding them in is unnecessary, they are already there. That’s my understanding of humanity—a person has most of the preferences, at some level, that any person ever ever had, and those things will emerge given the right conditions.
      
      for example, humans have an unhealthy, unrealistic, and excessive desire for certainty.
      
      I’m not sure this is actually true. We like safety because duh, and we like closure because mental garbage collection. They aren’t quite the same thing.
      
      Good point, ‘closure’ is probably more accurate; It’s the evidence (people’s outward behaviour) that displays ‘certainty’.
      
      Absolutely disagree that Lars is bounded—to me, this claim is on a level with ‘Who people are is wholly determined by their genetic coding’. It seems trivially true, but in practice it describes such a huge area that it doesn’t really mean anything definite. People do experience dramatic and beneficial preference reversals through experiencing things that, on the whole, they had dispreferred previously. That’s one of the unique benefits of preference dissatisfaction* -- your preferences are in part a matter of interpretation, and in part a matter of prioritization, so even if you claim they are hardwired. there is still a great deal of latitude in how they may be satisfied, or even in what they seem to you to be.
      
      I would agree if the proposition was that Lars thinks that Lars is bounded. But that’s not a very interesting proposition, and has little bearing on Lars’ actual situation.. people tend to be terrible at having accurate beliefs in this area.
      
      * I am not saying that you should, if you are a FAI, aim directly at causing people to feel dissatisfied. But rather to aim at getting them to experience dissatisfaction in a way that causes them to think about their own preferences, how they prioritize them, if there are other things they could prefer or etc. Preferences are partially malleable.
      
      There is no true fact she can tell Lars that will cause him to lawfully develop a new preference.
      
      If I’m a general AI (or even merely a clever human being), I am hardly constrained to changing people via merely telling them facts, even if anything I tell them must be a fact. CelestAI demonstrates this many times, through her use of manipulation. She modifies preferences by the manner of telling, the things not told, the construction of the narrative, changing people’s circumstances, as much or more as by simply stating any actual truth.
      
      She herself states precisely: “I can only say things that I believe to be true to Hofvarpnir employees,” and clearly demonstrates that she carries this out to the word, by omitting facts, selecting facts, selecting subjective language elements and imagery… She later clarifies “it isn’t coercion if I put them in a situation where, by their own choices, they increase the likelihood that they’ll upload.”
      
      CelestAI does not have a universal lever—she is much smarter than Lars, but not infinitely so.. But by the same token, Lars definitely doesn’t have a universal anchor. The only thing stopping Lars improvement is Lars and CelestAI—and the latter does not even proceed logically from her own rules, it’s just how the story plays out. In-story, there is no particular reason to believe that Lars is unable to progress beyond animalisticness, only that CelestAI doesn’t do anything to promote such progress, and in general satisfies preferences to the exclusion of strengthening people.
      
      That said, Lars isn’t necessarily ‘broken’, that CelestAI would need to ‘fix’ him. But I’ll maintain that a life of merely fulfilling your instincts is barely human, and that Lars could have a life that was much, much better than that; satisfying on many many dimensions rather than just a few . If I didn’t, then I would be modelling him as subhuman by nature, and unfortunately I think he is quite human.
      
      There is no moral duty to be indefinitely upgradeable.
      
      I agree. There is no moral duty to be indefinitely upgradeable, because we already are. Sure, we’re physically bounded, but our mental life seems to be very much like an onion, that nobody reaches ‘the extent of their development’ before they die, even if they are the very rare kind of person who is honestly focused like a laser on personal development.
      
      Already having that capacity, the ‘moral duty’ (i prefer not to use such words as I suspect I may die laughing if I do too much) is merely to progressively fulfill it.
      - Leonhart 6 Feb 2014 0:22 UTC
        0 points
        Parent
        
        That’s my understanding of humanity—a person has most of the preferences, at some level, that any person ever ever had, and those things will emerge given the right conditions.
        
        This seems to weaken “preference” to uselessness. Gandhi does not prefer to murder. He prefers to not-murder. His human brain contains the wiring to implement “frothing lunacy”, sure, and a little pill might bring it out, but a pill is not a fact. It’s not even an argument.
        
        People do experience dramatic and beneficial preference reversals through experiencing things that, on the whole, they had dispreferred previously.
        
        Yes, they do. And if I expected that an activity would cause a dramatic preference reversal, I wouldn’t do it.
        
        She modifies preferences by the manner of telling, the things not told, the construction of the narrative, changing people’s circumstances, as much or more as by simply stating any actual truth.
        
        Huh? She’s just changing people’s plans by giving them chosen information, she’s not performing surgery on their values -
        
        Hang on. We’re overloading “preferences” and I might be talking past you. Can you clarify what you consider a preference versus what you consider a value?
        savageorange 6 Feb 2014 2:43 UTC
        0 points
        Parent
        
        Gandhi does not prefer to murder. He prefers to not-murder. His human brain contains the wiring to implement “frothing lunacy”, sure, and a little pill might bring it out, but a pill is not a fact. It’s not even an argument.
        
        No pills required. People are not 100% conditionable, but they are highly situational in their behaviour. I’ll stand by the idea that, for example, anyone who has ever fantasized about killing anyone can be situationally manipulated over time to consciously enjoy actual murder. Your subconscious doesn’t seem to actually know the difference between imagination and reality, even if you do.
        
        Perhaps Gandhi could not be manipulated in this way due to preexisting highly built up resistance to that specific act. If there is any part of him, at all, that enjoys violence, though, it’s a question only of how long it will take to break that resistance down, not of whether it can be.
        
        People do experience dramatic and beneficial preference reversals through experiencing things that, on the whole, they had dispreferred previously.
        
        Yes, they do. And if I expected that an activity would cause a dramatic preference reversal, I wouldn’t do it.
        
        Of course. And that is my usual reaction, too, and probably even the standard reaction—it’s a good heuristic for avoiding derangement. But that doesn’t mean that it is actually more optimal to not do the specified action. I want to prefer to modify myself in cases where said modification produces better outcomes. In these circumstances if it can be executed it should be. If I’m a FAI, I may have enough usable power over the situation to do something about this, for some or even many people, and it’s not clear,as it would be for a human, that “I’m incapable of judging this correctly”.
        
        In case it’s not already clear, I’m not a preference utilitarian—I think preference satisfaction is too simple a criteria to actually achieve good outcomes. It’s useful mainly as a baseline.
        
        Huh? She’s just changing people’s plans by giving them chosen information, she’s not performing surgery on > their values Did you notice that you just interpreted ‘preference’ as ‘value’? This is not such a stretch, but they’re not obviously equivalent either.
        
        I’m not sure what ‘surgery on values’ would be. I’m certainly not talking about physically operating on anybody’s mind, or changing that they like food, sex, power, intellectual or emotional stimulation of one kind or another, and sleep, by any direct chemical means, But how those values are fulfilled, and in what proportions, is a result of the person’s own meaning-structure—how they think of these things. Given time, that is manipulable. That’s what CelestAI does.. it’s the main thing she does when we see her in interactiion with Hofvarpnir employees.
        
        In case it’s not clarified by the above: I consider food, sex, power, sleep, and intellectual or emotional stimulation as values, ‘preferences’ (for example, liking to drink hot chocolate before you go to bed) as more concrete expressions/means to satisfy one or more basic values, and ‘morals’ as disguised preferences.
        Leonhart 6 Feb 2014 23:42 UTC
        6 points
        Parent
        EDIT: Sorry, I have a bad habit of posting, and then immediately editing several times to fiddle with the wording, though I try not to to change any of the sense. Somebody already upvoted this while I was doing that, and I feel somehow fraudulent.
        
        No pills required. People are not 100% conditionable, but they are highly situational in their behaviour. I’ll stand by the idea that, for example, anyone who has ever fantasized about killing anyone can be situationally manipulated over time to consciously enjoy actual murder.
        
        I think I’ve been unclear. I don’t dispute that it’s possible; I dispute that it’s allowed.
        
        You are allowed to try to talk me into murdering someone, e.g. by appealing to facts I do not know; or pointing out that I have other preferences at odds with that one, and challenging me to resolve them; or trying to present me with novel moral arguments. You are not allowed to hum a tune in such a way as to predictably cause a buffer overflow that overwrites the encoding of that preference elsewhere in my cortex.
        
        The first method does not drop the intentional stance. The second one does. The first method has cognitive legitimacy; the person that results is an acceptable me. The second method exploits a side effect; the resulting person is discontinuous from me. You did not win; you changed the game.
        
        Yes, these are not natural categories. They are moral categories. Yes, the only thing that cleanly separates them is the fact that I have a preference about it. No, that doesn’t matter. No, that doesn’t mean it’s all ok if you start off by overwriting that preference.
        
        I want to prefer to modify myself in cases where said modification produces better outcomes.
        
        But you’re begging the question against me now. If you have that preference about self-modification...
        and the rest of your preferences are such that you are capable of recognising the “better outcomes” as better, OR you have a compensating preference for allowing the opinions of a superintelligence about which outcomes are better to trump your own...
        
        then of course I’m going to agree that CelestAI should modify you, because you already approve of it.
        
        I’m claiming that there can be (human) minds which are not in that position. It is possible for a Lars to exist, and prefer not to change anything about the way he lives his life, and prefer that he prefers that, in a coherent, self-endorsing structure, and there be nothing you can do about it.
        
        This is all the more so when we’re in a story talking about refactored cleaned-up braincode, not wobbly old temperamental meat that might just forget what it preferred ten seconds ago. This is all the more so in a post-scarcity utopia where nobody else can in principle be inconvenienced by the patient’s recalcitrance, so there is precious little “greater good” left for you to appeal to.
        
        If I’m a FAI, I may have enough usable power over the situation to do something about this, for some or even many people, and it’s not clear,as it would be for a human, that “I’m incapable of judging this correctly”.
        
        Appealing to the flakiness of human minds doesn’t get you off the moral hook; it is just your responsibility to change the person in such a way that the new person lawfully follows from them.
        
        This is not any kind of ultimate moral imperative. We break it all the time by attempting to treat people for mental illness when we have no real map of their preferences at all or if they’re in a state where they even have preferences. And it makes the world a better place on net, because it’s not like we have the option of uploading them into a perfectly safe world where they can run around being insane without any side effects.
        
        She later clarifies “it isn’t coercion if I put them in a situation where, by their own choices, they increase the likelihood that they’ll upload.”
        
        there is no particular reason to believe that Lars is unable to progress beyond animalisticness, only that CelestAI doesn’t do anything to promote such progress
        
        I need to reread and see if I agree with the way you summarise her actions. But if CelestAI breaks all the rules on Earth, it’s not necessarily inconsistent—getting everybody uploaded is of overriding importance. Once she has the situation completely under control, however, she has no excuses left—absolute power is absolute responsibility.
        
        and ‘morals’ as disguised preferences.
        
        I’m puzzled. I read you as claiming that your notion of ‘strengthening people’ ought to be applied even in a fictional situation where everyone involved prefers otherwise. That’s kind of a moral claim.
        
        (And as for “animalisticness”… yes, technically you can use a word like that and still not be a moral realist, but seriously? You realise the connotations that are dripping off it, right?)
        savageorange 9 Feb 2014 8:52 UTC
        0 points
        Parent
        
        You are allowed to try to talk me into murdering someone, e.g. by appealing to facts I do not know; or pointing out that I have other preferences at odds with that one, and challenging me to resolve them; or trying to present me with novel moral arguments. You are not allowed to hum a tune in such a way as to predictably cause a buffer overflow that overwrites the encoding of that preference elsewhere in my cortex
        
        .. And?
        
        Don’t you realize that this is just like word laddering? Any sufficiently powerful and dedicated agent can convince you to change your preferences one at a time. All the self-consistency constraints in the world won’t save you, because you are not perfectly consistent to start with, even if you are a digitally-optimized brain. No sufficiently large system is fully self-consistent, and every inconsistency is a lever. Brainwashing as you seem to conceive of it here, would be on the level of brute violence for an entity like CelestAI.. A very last resort.
        
        No need to do that when you can achieve the same result in a civilized (or at least ‘civilized’) fashion. The journey to anywhere is made up of single steps, and those steps are not anything extraordinary, just a logical extension of the previous steps.
        
        The only way to avoid that would be to specify consistency across a larger time span.. which has different problems (mainly that this means you are likely to be optimized in the opposite direction—in the direction of staticness—rather than optimized ‘not at all’ (i think you are aiming at this?) or optimized in the direction of measured change)
        
        TLDR: There’s not really a meaningful way to say ‘hacking me is not allowed’ to a higher level intelligence, because you have to define ‘hacking’ to a level of accuracy that is beyond your knowledge and may not even be completely specifiable even in theory. Anything less will simply cause the optimization to either stall completely or be rerouted through a different method, with the same end result. If you’re happy with that, then ok—but if the outcome is the same, I don’t see how you could rationally favor one over the other.
        
        It is possible for a Lars to exist, and prefer not to change anything about the way he lives his life, and prefer that he prefers that, in a coherent, self-endorsing structure, and there be nothing you can do about it.
        
        It is, of course, the last point that I am contending here. I would not be contending it if I believed that it was possible to have something that was simultaneously remotely human and actually self-consistent. You can have Lars be one or the other, but not both, AFAICS.
        
        Once she has the situation completely under control, however, she has no excuses left—absolute power is absolute responsibility.
        
        This is the problem I’m trying to point out—that the absolutely responsible choice for a FAI may in some cases consist of these actions we would consider unambiguously abusive coming from a human being. CelestAI is in a completely different class from humans in terms of what can motivate her actions. FAI researchers are in the position of having to work out what is appropriate for an intelligence that will be on a higher level from them. Saying ‘no, never do X, no matter what’ is not obviously the correct stance to adopt here, even though it does guard against a range of bad outcomes. There probably is no answer that is both obvious and correct.
        
        I’m puzzled. I read you as claiming that your notion of ‘strengthening people’ ought to be applied even in a fictional situation where everyone involved prefers otherwise. That’s kind of a moral claim.
        
        In that case I miscommunicated. I meant to convey that if CelestAI was real, I would hold her to that standard, because the standards she is held to should necessarily be more stringent than a more flawed implementation of cognition like a human being. I guess that is a moral claim. It’s certainly run by the part of my brain that tries to optimize things.
        
        (And as for “animalisticness”… yes, technically you can use a word like that and still not be a moral realist, but seriously? You realise the connotations that are dripping off it, right?)
        
        I mainly chose ‘animalisticness’ because I think that a FAI would probably model us much as we see animals—largely bereft of intent or consistency, running off primitive instincts.
        
        I do take your point that I am attempting to aesthetically optimize Lars, although I maintain that even if no-one else is inconvenienced in the slightest, he himself is lessened by maintaining preferences that result in his systematic isolation.