Then that isn’t the CEV operation.
The CEV operation tries to return a fixed point of idealized value-reflection. Running immortal people forward inside a simulated world falls well short of idealized value-reflection, for the reasons you suggest, so simply simulating people interacting for a long time isn’t running their CEV.
How would you run their CEV? I’m saying it’s not obvious how to do it in a way that both captures their actual volition and avoids coercion. You’re saying “idealized reflection”, but what does that mean?
Yeah, fair—I dunno. I do know that an incremental improvement on simulating a bunch of people in an environment philosophizing is doing that but running an algorithm that prevents coercion, e.g.
I imagine that the complete theory of these incremental improvements (for example, also not running a bunch of moral patients for many subjective years while computing the CEV) is the final theory we’re after, but I don’t have it.
Like, encoding what “coercion” is would be an expression of values. It’s more meta, and more universalizable, and stuff, but it’s still something that someone might strongly object to, and so it’s coercion in some sense. We could try to talk about what possible reflectively stable people / societies would consider as good rules for the initial reflection process, but it seems like there would be multiple fixed points, and probably some people today would have revealed preferences that distinguish those possible fixed points of reflection, still leaving open conflict.
Cf. https://www.lesswrong.com/posts/CzufrvBoawNx9BbBA/how-to-prevent-authoritarian-revolts?commentId=3LcHA6rtfjPEQne4N