But if CEV doesn’t give the same result when seeded with humans from any time period in history, I think that means it doesn’t work, or else that human values aren’t coherent enough for it to be worth trying.
Hmm, maybe one could try to test the CEV implementation by running it on historical human values and seeing whether it approaches modern human values (when not run all the way to convergence).
Well, think about a world in which most of it turns out pretty similar, but some part, say 2% to 20%, depends on historical circumstance (and where that part is fixed once CEVed). I think we may live in a world like that.
That seems wrong.
As a counterexample, consider a hypothetical morality development model where, as history advances, human morality keeps accumulating invariants in a largely unpredictable (chaotic) fashion. In that case modern morality would have more invariants than that of earlier generations. You could implement a CEV from any time period, but earlier time periods would lead to some consequences that by present standards are very bad, and would predictably remain very bad in the future; nevertheless, a present-humans CEV would still work just fine.
I don’t know what you mean by invariants, or why you think they’re good, but: if the natural development from that earlier time period, unconstrained by CEV, did better than a CEV seeded from that period would have, then CEV is worse than doing nothing at all.
I used “invariant” here to mean “moral claim that will hold for all successor moralities”.
A vastly simplified example: at t=0, morality is completely undefined. At t=1, people decide that death is bad, and lock this in indefinitely. At t=2, people decide that pleasure is good, and lock that in indefinitely. Etc.
An agent operating in a society that develops morality like that, looking back, would want all the accidents that led to current morality to be preserved, but looking forward may not particularly care about how the remaining free choices come out. CEV in that kind of environment can work just fine, and someone implementing it in that situation would want to target it specifically at people from their own time period.
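For concreteness, here is a minimal sketch of that toy lock-in model (the `develop_morality` helper and the specific claims are hypothetical, chosen only to illustrate the point): two runs seeded from the same history agree on every invariant locked in up to the earlier date, and the later run only adds to them, which is why such an agent mainly cares about preserving the accidents already behind it.

```python
import random

def develop_morality(steps, seed):
    """Toy model of the lock-in story above: at each step, society locks in
    one new moral 'invariant' (an arbitrary claim picked by historical
    accident) and never revisits earlier ones. Hypothetical illustration only."""
    rng = random.Random(seed)
    candidate_claims = [
        "death is bad", "pleasure is good", "fairness matters",
        "autonomy matters", "suffering is bad", "knowledge is good",
    ]
    invariants = []
    for _ in range(steps):
        remaining = [c for c in candidate_claims if c not in invariants]
        if not remaining:
            break
        invariants.append(rng.choice(remaining))  # chaotic, path-dependent lock-in
    return invariants

# A CEV seeded from an earlier period fixes only the invariants accumulated so far;
# a later period shares that prefix but has locked in more.
early = develop_morality(steps=2, seed=0)
late = develop_morality(steps=4, seed=0)
print("early invariants:", early)
print("late extends early:", late[: len(early)] == early)  # True: later morality only adds
```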
Or else, the human values we care about are, say, ours (taken as broadly as possible, but not broader than that).