The strongest argument against value drift (meaning the kind of change in current values that involves a change in idealized values) is the instrumental usefulness of future values that still pursue idealized present values. This says nothing about the terminal value of value drift, and a priori we should expect people to place some terminal value on the presence of value drift, because there is no reason for haphazard human values to single out the possibility of zero value drift as the most valuable one. Value drift is just another thing that happens in the world, like kittens. Of course, valuable value drift must observe proper form even as it breaks idealized values, since most changes are not improvements.
The instrumental argument is not that strong when your own personal future values don’t happen to control the world. So the argument still applies to AIs that have significant influence over what happens in the future, but not to ordinary people, especially not to people whose values are not particularly unusual.
Your prior assumes that each concept is assigned a value that is unlikely to be exactly zero, rather than that there is a finite list of concepts we care about one way or the other, a list that value drift is not especially likely to land on.