It looks to me like Sobel’s fourth objection may stem, in behavioral-economics terms, from prospect theory’s position-relative evaluation of gains and losses, in which losses are more painful than corresponding gains are pleasurable (typically by an empirical factor of around 2 to 2.5).
These position-relative evaluations are already inconsistent, i.e., they can be reliably manipulated in laboratory settings to yield circular preferences. So construing a volition probably already requires (just to end up with a consistent utility function and coherent instrumental strategies) that we transform the position-relative evaluations into outcome evaluations somehow.
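To make the sort of inconsistency I have in mind concrete, here is a small toy example (entirely my own construction, with made-up attribute values; only the loss-aversion coefficient is a standard empirical estimate): evaluating two outcomes as gains and losses from whichever one you currently occupy endorses staying put in either direction, whereas a single reference-independent utility over outcomes ranks them the same way no matter where you stand.

```python
# Toy illustration (my own construction, not from the post): reference-dependent
# evaluation with loss aversion makes pairwise preference depend on which outcome
# you currently occupy, whereas an outcome utility gives one fixed ranking.

LOSS_AVERSION = 2.25  # typical empirical estimate (Tversky & Kahneman, 1992)

def gain_loss_value(delta):
    """Prospect-theory-style value of a change relative to the reference point."""
    return delta if delta >= 0 else LOSS_AVERSION * delta

def prefers_switch(current, alternative):
    """Score 'alternative' attribute-by-attribute as gains/losses from 'current'."""
    return sum(gain_loss_value(a - c) for c, a in zip(current, alternative)) > 0

def outcome_utility(outcome):
    """A reference-independent utility over final outcomes (here just the sum)."""
    return sum(outcome)

A = (0, 6)   # illustrative attribute bundles, e.g. (money, leisure)
B = (5, 0)

print(prefers_switch(A, B))   # False: from A, moving to B looks like a net loss
print(prefers_switch(B, A))   # False: from B, moving to A also looks like a net loss
# The position-relative evaluation endorses whichever state it starts in; the
# outcome utility ranks the two states the same way from anywhere.
print(outcome_utility(A), outcome_utility(B))   # 6 5
```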
The ‘Ideal Advisor’ part would come in at the point where we handed this ‘construed’ volition a veridical copy of the original predictive model of the human. Thus, this new value system could still reliably predict the actual experiences and reactions of the original human, rather than falsely supposing that the actual human would react in the same way as the construed volition would.
So the construed volition would itself have some coherent utility function over experiences the original human could have; it would not see the human’s current state as a huge loss relative to its own position, because it would no longer be evaluating gains and losses. It would also be able to correctly evaluate that the original human would experience various life-improvements as large, joyful gains.
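As a very schematic sketch of that separation (my own toy framing, not anything specified in the original proposal; the names and numbers are purely illustrative), the construed volition would carry a reference-independent utility over the human’s possible experiences alongside a veridical predictive model of the human’s reactions, and would query the two separately rather than scoring everything as gains or losses from its own vantage point:

```python
# Toy sketch only: a construed volition that keeps its own (coherent, outcome-based)
# utility separate from its veridical model of how the original human would react.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ConstruedVolition:
    outcome_utility: Callable[[str], float]         # coherent utility over outcomes, no gain/loss framing
    predict_human_reaction: Callable[[str], float]  # veridical model of the original human's felt reaction

    def judge(self, outcome: str) -> float:
        # The volition's own evaluation of an outcome for the human.
        return self.outcome_utility(outcome)

    def predicted_joy(self, before: str, after: str) -> float:
        # What the *human* is predicted to feel about the change -- this is where
        # "the human would experience the improvement as a large gain" lives.
        return self.predict_human_reaction(after) - self.predict_human_reaction(before)

# Illustrative numbers only:
utilities = {"current life": 10.0, "improved life": 25.0}
reactions = {"current life": 0.0, "improved life": 8.0}
volition = ConstruedVolition(utilities.get, reactions.get)

print(volition.judge("current life"))                           # 10.0 -- not framed as a loss
print(volition.predicted_joy("current life", "improved life"))  # 8.0  -- a gain for the human
```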
So Sobel’s fourth objection would probably not arise if the process of construing a volition proceeded in that particular fashion. This is not ad hoc: positional evaluation was already a large source of inconsistency that would have to be transformed into a coherent utility function somehow, and giving the idealized process veridical knowledge of the original human is likewise a basic paradigm of volition (the whole Ideal Advisor setup).
Sobel’s third and second objections seem to revolve around how a construed volition operates over its (abstract) model of possible life experiences that could occur to the original human. (This model had better be abstract! We don’t want to inadvertently create people by simulating them in full detail during the process of deciding whether or not to create them.) Suppose we have a construed volition with a coherent utility function, looking over a set of lives that the original human might experience. The amnesia problem is already dissipated if we can pull off this setup; the construed volition does not forget anything.

The second problem, the supposed impossibility of choosing correctly between two lives without actually having led both (while the prospect of leading both introduces an ordering effect), gets us into much thornier territory. Let’s first note that it’s not obvious that the correct judgment is the one you’d make if you’d actually led a certain life: heroin!addict!Eliezer thinks that heroin is an absolutely great idea, but I don’t want my volition to be construed such that its knowledge that this overpowering psychological motivation would counterfactually result from heroin addiction would actually constitute a reason to feed me heroin. I think this points in the direction of an Ideal Advisor ethics wherein construing a volition looks more like modeling how my current values judge future experiences, including my current values over having new desires fulfilled, than like construing my volition to have direct empathy with future selves, i.e., translating their own psychological impulses into volitional impetuses of equal strength (a toy contrast is sketched below). This doesn’t so much deal with Sobel’s second objection as pack it into the problem of construing a volition that shows an analogue of my care for my own (and others’) future selves without experiencing ‘direct empathy’ or direct translation of forceful desires.

We’re also dancing around the difficulty of having a construed volition which has values over predicted conscious experiences without that volition itself being a bearer of conscious experiences, mostly because I still don’t have any good idea of how to solve that one. Resolving consciousness to be less mysterious hasn’t yet helped me much in figuring out how to accurately model things getting wet without modeling any water.
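Here is the toy contrast referred to above (entirely my own illustration; the lives, desire strengths, and scores are made up): ranking possible lives by the future self’s own desire-strength versus ranking them by how my current values judge the experiences in them.

```python
# Toy contrast (my own illustration, not a proposal from the post): "direct empathy"
# translates each future self's psychological impulse into impetus of equal strength;
# the alternative judges each life by my *current* values over its experiences.

possible_lives = {
    # life: (future self's desire strength for that life, current-values score of its experiences)
    "heroin addiction":   (0.99, -50.0),   # overpowering craving, experiences I currently disvalue
    "ordinary good life": (0.60,  30.0),
    "learning & friends": (0.70,  45.0),
}

def direct_empathy_choice(lives):
    """Pick the life whose future self wants it most strongly."""
    return max(lives, key=lambda life: lives[life][0])

def current_values_choice(lives):
    """Pick the life my current values judge best, including values over desire-fulfilment."""
    return max(lives, key=lambda life: lives[life][1])

print(direct_empathy_choice(possible_lives))   # 'heroin addiction'
print(current_values_choice(possible_lives))   # 'learning & friends'
```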
Sobel’s first problem has been a to-do in CEV since day one (the original essay proposed evaluating a spread of possibilities), and I’m willing to point to Bostrom’s parliament as the best model yet offered. There’s no such thing as “too many voices”, just the number of voices you can manage to model on available hardware.
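For concreteness, here is a minimal stand-in for that idea (my own simplification; Bostrom’s actual proposal has the delegates bargain rather than merely vote): each modeled voice gets a weight, casts a weighted vote for its favourite option, and the only limit on the number of voices is the compute spent modeling them.

```python
# Minimal stand-in (my own simplification) for a parliament of weighted voices.
from collections import defaultdict

def parliament_vote(voices, options):
    """
    voices: list of (weight, score_fn) pairs, one per modeled voice/possibility.
    options: candidate outcomes; each score_fn maps an option to a number.
    Returns the option with the most weighted first-place votes.
    """
    tally = defaultdict(float)
    for weight, score in voices:
        favourite = max(options, key=score)
        tally[favourite] += weight
    return max(tally, key=tally.get)

# Illustrative voices with made-up weights and preferences:
voices = [
    (0.5, lambda o: {"x": 2, "y": 1, "z": 0}[o]),
    (0.3, lambda o: {"x": 0, "y": 2, "z": 1}[o]),
    (0.2, lambda o: {"x": 0, "y": 1, "z": 2}[o]),
]
print(parliament_vote(voices, ["x", "y", "z"]))   # 'x'
```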