As you mention, so far every attempt by humans to construct a self-consistent value system (a process also known as decompartmentalization) has resulted in less-than-desirable outcomes. What if the end goal of having a thriving, long-lasting (super-)human(-like) society is self-contradictory, and there is no such thing as both “nice” and “self-referentially stable”? Maybe some effort should be put into figuring out how to live, and thrive, while managing the unstable self-reference, possibly avoiding convergence altogether.
A thought I’ve been thinking about lately, derived from a reinforcement-learning view of values and also somewhat inspired by Nate’s recent post on resting in motion: value convergence seems to suggest a static endpoint, some set of “ultimate values” that we’ll eventually reach and then hold ever after. But so far no society has ever reached such a point, and if our values are an adaptation to our environment (including the society and culture we live in), then it would suggest that as long as we keep evolving and developing, our values will keep changing and evolving with us, without there being any meaningful endpoint.
There will always (given our current understanding of physics) be only a finite amount of resources available, and unless we either all merge into one enormous hivemind or get turned into paperclips, there will likely be various agents with differing preferences on what exactly to do with those resources. As the population keeps changing and evolving, the various agents will keep acquiring new kinds of values, and society will keep rearranging itself to a new compromise between all those different values. (See: the whole history of the human species so far.)
Possibly we shouldn’t so much try to figure out what we’d prefer the final state to look like, but rather what we’d prefer the overall process to look like.
(The bias towards trying to figure out a convergent end result for morality might have come from LW’s historical tendency to talk and think in terms of utility functions, which implicitly assume a static, unchanging set of preferences, glossing over the fact that human preferences are constantly changing.)
Possibly we shouldn’t so much try to figure out what we’d prefer the final state to look like, but rather what we’d prefer the overall process to look like.
Well, the general Good Idea in that model is that events or actions shouldn’t be optimized to drift faster or more discontinuously than people’s valuations of those events do, so that the society existing at any given time is more or less getting what it wants while also evolving towards something else.
Of course, a compromise between the different “values” (scare-quotes because I don’t think the moral-philosophy usage of the word points at anything real) of society’s citizens is still a vast improvement on “a few people dominate everyone else and impose their own desires by force and indoctrination”, which is what we still have to a great extent.
This sounds like Robin Hanson’s idea of the future. Eliezer would probably agree that in theory this would happen, except that he expects a single superintelligent AI to take over everything and impose its values on the entire future. If Eliezer’s scenario is what actually happens, then even if there is no truly ideal set of values, we would still have to make sure that the values that get imposed on everything are at least somewhat acceptable.
This. Values evolve, like everything else. Evolution will continue in the posthuman era.
Evolution requires selection pressure. The failures have to die out. What will provide the selection pressure in the posthuman era?
“Evolve” has (at least) two meanings. One is the Darwinian one where heritable variation and selection lead to (typically) ever-better-adapted entities. But “evolve” can also just mean “vary gradually”. It could be that values aren’t (or wouldn’t be, in a posthuman era) subject to anything much like biological evolution; but they still might vary. (In biological terms, I suppose that would be neutral drift.)
Well, we are talking about the Darwinian meaning, aren’t we? “Vary gradually”, a.k.a. “drift”, is not contentious at all.
I’m not sure we are talking specifically about the Darwinian meaning, actually. Well, I guess you are, given your comment above! But I don’t think the rest of the discussion was so specific. Kaj_Sotala said:
if our values are an adaptation to our environment (including the society and culture we live in), then it would suggest that as long as we keep evolving and developing, our values will keep changing and evolving with us, without there being any meaningful endpoint.
which seems to me to describe a situation of gradual change in our values that doesn’t need to be driven by anything much like biological evolution. (E.g., it could happen because each generation’s people constantly make small more-or-less-deliberate adjustments in their values to suit the environment they find themselves in.)
(Kaj’s comment does actually describe a resource-constrained situation, but the resource constraints aren’t directly driving the evolution of values he describes.)
We’re descending into nit-pickery. The question of whether values will change in the future is a silly one, as the answer “Yes” is obvious. The question of whether values will evolve in the Darwinian sense in the posthuman era (with its presumed lack of scarcity, etc.) is considerably more interesting.
I agree that it’s more interesting. But I’m not sure it was the question actually under discussion.
I’m not sure that’s true. Imagine some glorious postbiological future in which people (or animals or ideas or whatever) can reproduce without limit. There are two competing replicators A and B, and the only difference is that A replicates slightly faster than B. After a while there will be vastly more of A around than of B, even if nothing dies. For many purposes, that might be enough.
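For concreteness, here is a minimal sketch of that scenario (the growth rates are my own illustrative assumptions, not anything from the comment): two replicators that never die, with A replicating only slightly faster than B, still end up with A utterly dominating the population’s composition.

    # Toy model: nothing ever dies; A grows 5% per step, B grows 4% per step
    # (illustrative numbers only).
    a, b = 1.0, 1.0
    for _ in range(1000):
        a *= 1.05  # A replicates slightly faster
        b *= 1.04  # B replicates slightly slower
    print(a / (a + b))  # fraction of the population that is A; roughly 0.9999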
So, in this scenario, what evolved?
The distribution of A and B in the population.
I don’t think this is an appropriate use of the word “evolution”.
Why not? It’s a standard one in the biological context. E.g.,
In fact, evolution can be precisely defined as any change in the frequency of alleles within a gene pool from one generation to the next.
which according to a talk.origins FAQ is from this textbook: Helena Curtis and N. Sue Barnes, Biology, 5th ed., 1989, Worth Publishers, p. 974.
Economics. Posthumans still require mass/energy to store/compute their thoughts.
If there are mistakes made or the environment requires adaptation, a sufficiently flexible intelligence can mediate the selection pressure.
The end result still has to be for the failures to die or be castrated.
There is no problem with saying that values in the future will “change” or “drift”, but “evolve” is more specific and I’m not sure how it will work.
Memetic evolution, not genetic.
I understand that. Memes can die or be castrated, too :-/
In your earlier comment you said “evolution requires selection pressure”. There is of course selection pressure in memetic evolution. The idea of completely eliminating memetic selection pressure is not even wrong, because memetic selection is closely connected to learning and knowledge creation. You can’t get rid of it.
Godsdammit, people, “thrive” is the whole problem.
Yes, yes it is. Even once you can order all the central examples of thriving, the “mere addition” operation will tip them toward the noncentral, repugnant ones. Hence one might have to live with the lack of self-consistency.
You could just not be a utilitarian; in particular, you could decline to maximize a metaphysical quantity like “happy experience”. That leaves you with no moral obligations to counterfactual (i.e. nonexistent) people, which eliminates the Mere Addition Paradox.
OK, I know that, given the chemistry involved in “happy”, it’s not exactly a metaphysical or non-natural quantity, but it bugs me that utilitarianism says to “maximize Happy” even when, precisely as in the Mere Addition Paradox, no individual consciousness will actually experience the magnitude of Happy attained via utilitarian policies. How can a numerical measure of a subjective state of consciousness be valuable if nobody experiences the total numerical measure? It seems more sensible to restrict your moralizing to people who already exist, thus winding up closer to post-hoc consequentialism than to traditional utilitarianism.
How can a numerical measure of a subjective state of consciousness be valuable if nobody experiences the total numerical measure?
The mere addition paradox also manifests for a single person. Imagine the state you are in. Now imagine that it can be (subjectively) improved by some means (e.g. fame, company, drugs, …). Keep going. Odds are, you would not find a maximum, not even a local one. After a while you might notice that, despite the incremental improvements, the state you are in is actually inferior to the original if you compare them directly. Mathematically, one might model this as the improvement drive being non-conservative, so that no scalar utility function over states exists. Whether it is worth pushing this analogy any further, I am not sure.
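One way to make the “non-conservative” intuition precise (my formalization, not anything stated in the comment): if the chain of subjective improvements ever loops back on itself, no real-valued utility can represent it. In LaTeX:

    s_1 \prec s_2 \prec \dots \prec s_n \prec s_1
    \;\Longrightarrow\;
    u(s_1) < u(s_2) < \dots < u(s_n) < u(s_1),

a contradiction, so no function u satisfying s \prec t \Rightarrow u(s) < u(t) can exist. This is the discrete analogue of a non-conservative force field admitting no potential.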
The mere addition paradox also manifests for a single person. Imagine the state you are in. Now imagine that it can be (subjectively) improved by some means (e.g. fame, company, drugs, …). Keep going. Odds are, you would not find a maximum, not even a local one.
Hill climbing always finds a local maximum, but that maximum might well look disappointing, wasteful of effort, and downright stupid compared to spending the same effort on some smarter way of finding a better life.
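As a toy illustration of both points (the landscape and numbers are my own assumptions, nothing from the thread): greedy hill climbing does halt at a local maximum, but that maximum can sit far below the global one.

    def hill_climb(f, x, neighbors):
        """Greedy ascent: move to the best improving neighbor until none exists."""
        while True:
            better = [n for n in neighbors(x) if f(n) > f(x)]
            if not better:
                return x  # local maximum: no neighbor improves f
            x = max(better, key=f)

    # Toy landscape over the integers: a small bump peaking at x = 3 (height 5)
    # and a much taller peak at x = 50 (height 100).
    f = lambda x: 5 - (x - 3) ** 2 if x < 10 else 100 - (x - 50) ** 2
    neighbors = lambda x: [x - 1, x + 1]

    print(hill_climb(f, 0, neighbors))  # prints 3: stuck on the small bump, never reaches 50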