I see the connection between self-alignment and human values as follows: the idea of human values assumes that a human has a stable set of preferences. Stability is an important part of the idea of human values. But the human motivational system is notoriously unstable: I want to drink, I have a drink, and now I don’t want to drink. The idea of “desires” may be a better fit than “human values”, as it is normal for desires to evolve and contradict each other.
But the human motivational system is more complex than that: I have rules and I have desires, which often contradict each other and are in dynamic balance. For example, I have a rule not to drink alcohol and a desire for a drink.
Speaking about your bullet points: everything depends on the situation, and there are two main types of situations: a) researchers start the first-ever AI for the first time; b) a consumer uses a home robot for a task. In the second case, the robot is likely trained on a very large dataset and knows what the good and bad outcomes are for almost all possible situations.