What research questions would you pursue if you were committed to researching this area?
One interesting question is, when deciding to CEV people, which era to extract the people from. Saying that all eras would result in the same CEV is equivalent to saying that there is a fundamental correlation between the course of history and coherence, with but one final telos. That is an unlikely hypothesis, for reasons ranging from evolutionary drift (as opposed to convergence), to orthogonality in ethics, to reference class tennis.
So researching how to allocate CEV among individuals and groups would be a fascinating area to delve into.
But if CEV doesn’t give the same result when seeded with humans from any time period in history, I think that means it doesn’t work, or else that human values aren’t coherent enough for it to be worth trying.
Hmm, maybe one could try to test the CEV implementation by running it on historical human values and seeing whether it approaches modern human values (when not run all the way to convergence).
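A backtest along those lines might look something like the toy sketch below. Everything in it is a stand-in: the value snapshots are just vectors, and `extrapolate_step` is a hypothetical placeholder for whatever the real extrapolation machinery would be; the point is only the shape of the test (seed with old values, run a bounded number of steps, check whether the distance to modern values shrinks).

```python
import numpy as np

def extrapolate_step(values: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Placeholder for one step of value extrapolation (purely illustrative)."""
    return values + rng.normal(scale=0.05, size=values.shape)

def backtest(historical: np.ndarray, modern: np.ndarray,
             steps: int = 10, seed: int = 0) -> float:
    """How much closer (in L2 distance) a bounded extrapolation gets to modern values.

    Positive = the extrapolation moved toward modern values; negative = away.
    """
    rng = np.random.default_rng(seed)
    values = historical.copy()
    baseline = np.linalg.norm(historical - modern)
    for _ in range(steps):          # deliberately not run all the way to convergence
        values = extrapolate_step(values, rng)
    return baseline - np.linalg.norm(values - modern)

# Example with made-up 5-dimensional value snapshots.
historical_1800 = np.array([0.1, 0.9, 0.2, 0.5, 0.0])
modern_2020 = np.array([0.7, 0.6, 0.8, 0.5, 0.9])
print(backtest(historical_1800, modern_2020))
```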
Well, think about a world in which most of it turns out pretty similar, but some fraction, say 2% to 20%, depends on historical circumstance (and gets locked in once CEVed). I think we may live in a world like that.
That seems wrong.
As a counterexample, consider a hypothetical morality development model where as history advances, human morality keeps accumulating invariants, in a largely unpredictable (chaotic) fashion. In that case modern morality would have more invariants than that of earlier generations. You could implement a CEV from any time period, but earlier time periods would lead to some consequences that by present standards are very bad, and would predictably remain very bad in the future; nevertheless, a present-humans CEV would still work just fine.
I don’t know what you mean by invariants, or why you think they’re good, but: If the natural development from this earlier time period, unconstrained by CEV, did better than CEV from that time period would have, that means CEV is worse than doing nothing at all.
I used “invariant” here to mean “moral claim that will hold for all successor moralities”.
A vastly simplified example: at t=0, morality is completely undefined. At t=1, people decide that death is bad, and lock this in indefinitely. At t=2, people decide that pleasure is good, and lock that in indefinitely. Etc.
An agent operating in a society that develops morality like that, looking back, would want all the accidents that led to current morality to be maintained, but looking forward may not particularly care about how the remaining free choices come out. CEV in that kind of environment can work just fine, and someone implementing it in that situation would want to target it specifically at people from their own time period.
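A minimal sketch of that accumulating-invariants story, with the particular moral claims and the lock-in order invented purely for illustration:

```python
import random

CANDIDATE_CLAIMS = [
    "death is bad",
    "pleasure is good",
    "suffering is bad",
    "fairness matters",
    "autonomy matters",
]

def history(seed: int = 0) -> list[set[str]]:
    """Locked-in invariants at each timestep t = 0, 1, 2, ..."""
    rng = random.Random(seed)
    order = CANDIDATE_CLAIMS[:]
    rng.shuffle(order)                    # the chaotic, path-dependent part
    snapshots = [set()]                   # t=0: morality completely undefined
    for claim in order:
        snapshots.append(snapshots[-1] | {claim})   # lock in one more invariant
    return snapshots

def cev_seeded_at(t: int, seed: int = 0) -> set[str]:
    """Invariants a CEV would preserve if seeded with people from timestep t."""
    return history(seed)[t]

snapshots = history()
for t, invariants in enumerate(snapshots):
    print(f"t={t}: {sorted(invariants)}")
print("A CEV seeded at t=2 would miss:", sorted(snapshots[-1] - cev_seeded_at(2)))
```

A CEV seeded at any t still works, in the sense that it preserves every invariant locked in by then; what it cannot do is protect the invariants that hadn’t accumulated yet, which is why earlier-seeded runs look bad by later standards.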
Or else, the human values we care about are, say, ours (taken as broadly as possible, but not broader than that).
I think the two most-important decisions are:
Build a single AI and give it ultimate power, or build a stable ecosystem / balance of power between AIs?
Try to pass on specific values of ours, or try to ensure that life continues operating under parameters that produce some beings that have values something like that?
Each of these decisions suggests research questions.
1a. How can we extend our models of competition to hierarchical agents, i.e. agents that are composed of other agents? Is most of the competition at the top level, or at the lower levels? (For starters, is there some natural distribution of the number of agents of different sizes / levels / timescales, like there is for cities of different sizes? See the toy sketch after this list.) The purpose is to ask whether we can maintain useful competition within a singleton.
1b. For some set of competing hierarchical AIs, what circumstances make it more likely for one to conquer and subsume the others? Under what circumstances might a singleton AI split up into multiple AIs? The purpose is to estimate whether it’s possible to indefinitely avoid permanent collapse into a singleton.
2a. Try to find a candidate set of human values. Find how each is implemented neurally. The purpose is to see whether such things exist, what sorts of things they are, and whether they’re the sort of things that can be implemented in a logic.
2b. List the behaviors of a wide variety of animals. Find values/preferences/behaviors of interest, and for each, find the conditions that tend to lead animals to have / not have those behaviors, as I did for boredom in this comment. The purpose is to see what fraction of the space of behaviors is acceptable to us, and to discover the evolutionary conditions that lead to that fraction of that space. That will give us an idea of how tightly we can constrain future values by controlling the gross parameters of the ecosystem.
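For 1a/1b, a toy sketch along these lines (the conquest rule and all parameters are invented for illustration, not a claim about real AI dynamics): agents of different sizes meet in random pairs, the larger absorbs the smaller with probability proportional to its share of the combined size, and we then rank the survivors to eyeball whether the size distribution comes out heavy-tailed / Zipf-like, the way city sizes do.

```python
import random

def simulate(n_agents: int = 500, rounds: int = 400, seed: int = 0) -> list[float]:
    """Sizes of surviving agents after a series of chance encounters."""
    rng = random.Random(seed)
    sizes = [1.0] * n_agents
    for _ in range(rounds):
        if len(sizes) < 2:
            break                                  # full collapse into a singleton
        i, j = rng.sample(range(len(sizes)), 2)
        if sizes[i] < sizes[j]:
            i, j = j, i                            # make i the (weakly) larger agent
        p_conquer = sizes[i] / (sizes[i] + sizes[j])   # bigger size gap, easier conquest
        if rng.random() < p_conquer:
            sizes[i] += sizes[j]                   # conquest: absorb the smaller agent
            sizes.pop(j)
    return sorted(sizes, reverse=True)

ranked = simulate()
print(f"{len(ranked)} agents survive; largest sizes: {[round(s, 1) for s in ranked[:5]]}")
# A Zipf-like outcome would have size falling off roughly as 1/rank.
for rank in (1, 2, 4, 8, 16):
    if rank <= len(ranked):
        print(f"rank {rank}: size {ranked[rank - 1]:.1f}")
```

Whether anything like this holds for hierarchical agents, or whether the dynamics instead favor permanent collapse into a singleton, is exactly what 1a and 1b are asking.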
Or 3) Don’t pass control to AIs at all. Don’t even build agent-y AIs. Augment humans instead.
This may be a good way to start, but it eventually leads to the same place.
I think you’ll need to explain that, because I don’t see it at all. We’ve made life a lot better for most people on this planet by creating power-sharing arrangements that limit any single person’s autocratic powers, and by expanding the franchise to all. Yet I see many people here advocating basically a return to autocratic rule by our AI overlords, with no vote for the humans left behind. Essentially, “let’s build a provably beneficial dictator!” This boggles my mind.
The alternative is to decentralize transhumanist technology and push as many people as possible through an augmentation pathway in lockstep, preserving our democratic power structures. This sidesteps the friendly AI problem entirely.
Agreed, though I’m probably boggled for different reasons.
Eventually, the software will develop to the point where the human brain will be only a tiny portion of it. Or somebody will create an AI not attached to a human. The body we know will be left behind or marginalized. There’s a whole universe out there, the vast majority of it uninhabitable by humans.
“The software”? What software? The “software” is the human, in an augmented human. I’m not sure whatever distinction you’re drawing here is relevant.
Presumably ‘the software’ is the software that was not part of the original human.
Researching CEV as a foregone conclusion, or researching whether it is a good idea?
Either