Regarding “changing of human values”.
There are two basic theories of moral progress (although mixtures of the two are possible):
1. Morals change over time in some random or arbitrary fashion, without any objective “improvement”. Morals only seem to improve over time because, looking back along the timeline, they naturally get closer to our own as they approach the present.
2. Morals improve over time in some objective sense. That is, the change in morals is the result of better epistemic knowledge and more introspection and deliberation.
If you believe in theory 1, then there is nothing wrong with locking in the values. Yes, future generations would have different morals without the lock, but so what? I care about my own values, not the values of future generations (that’s why they’re called my values).
If you believe in theory 2, then the AGI will perform the extrapolation itself (as in extrapolated volition). It will do so much better than we can, since it will gather far more epistemic knowledge and will understand our brains much better than we understand them ourselves.
If theory 2 were correct, the AI would quickly extrapolate to the best possible version, which would be so alien to us that most of us would find it hellish. And if it changed us so that we would accept that world, then we would no longer be “us”.
This reminds me of the novel Three Worlds Collide. (I read it quite some time ago and never realized until now that it was originally posted here on LessWrong!)
Humans make first contact with an alien species whose morality is built around eating most of their fully conscious children (it makes sense in context). Of course, the humans find it most ethical to either exterminate them or forcibly change them so that the children’s suffering can be stopped… but then we encounter another species that has eliminated all kinds of pain, both emotional and physical, and finds us just as abhorrent as we found the “baby-eaters”. And they want to change us into blobs that don’t feel any emotions and exist in a blissful state of constant orgy.
Presumably the extrapolated values will also include a preference for slow, gradual transitions. The AI will therefore make the transition gradual.
If you believe that morals improve over time in some objective sense, then you should examine how morals have changed over time, figure out which morals are going to win, and adopt those morals immediately.