My view w.r.t. moral reflection leading to things we perceive as bad is that it ultimately comes down, I suspect, to the fact that there are too many valid answers to the questions “What’s moral/ethical?” and “What’s the CEV?” Indeed, I think there are infinitely many valid answers to these questions.
This leads to several issues for alignment:
Your endpoint in reflection completely depends on your starting assumptions, and these assumptions are choosable.
There is no safeguard against someone reflecting and ending up at a point where they harm someone else’s values. Thus, there is no guarantee of avoiding values that seem bad from our perspective.
The endpoints aren’t constrained by default, so you have to hope that the reflection process doesn’t lead to your values being diminished or violated.