It is normal for human values to evolve. If my values had been fixed when I was 6 years old, I would be regarded as mentally ill.
However, there are normal human speeds and directions of value evolution, and there are ways of evolving values that could be regarded as too quick, too slow, or going in a strange direction. In other words, the acceptable speed and direction of value drift is a normative assumption. For example, I find it normal for a person to be fascinated with some philosophical system for years and then move on to another one. If a person changes their ideology every day, or stays fixed on the “correct” one from age 12 until 80, I find that less mentally healthy.
In the same way, I prefer an AI whose goals evolve over millions of years to an AI whose goals evolve in seconds or are fixed forever.
Human values evolve in human ways. A priori, an AI’s value drift would almost surely take it in alien, worthless-to-us directions. A non-evolving AI sounds easier to align—we only need to hit the human-aligned region of valuespace once instead of needing to keep hitting it.