I’d find it troubling if my current object-level values (or a simple, more coherent modification of them) were locked in for humanity, but at least as troubling if humanity’s values drifted in a random direction.
I’m assuming that by “random” you mean “chosen uniformly from all possible outcomes,” and I agree that would be undesirable. But I don’t think that’s the choice we’re looking at.
I’d much prefer that value drift happen according to humanity’s shared meta-values (and meta-meta-values where the meta-values conflict, and so on).
Here we run into a few issues. Depending on how we define the terms, it looks like the two of us could be conflicting at the meta-meta-values stage; is there a meta-meta-meta-values stage to appeal to? And how do we decide what “humanity’s” values are, when even our individual values are incredibly hard to determine?