What would you say is the primary source of the problem?
The fact that humans don’t generalize well out of distribution, especially on moral questions; and the fact that progress can cause distribution shifts that lead us to fail to achieve our “true values”.
What do you think the implications of this are?
Um, nothing in particular.
I’m not sure why you ask this though.
It’s very hard to understand what people actually mean when they say things, and a good way to check is to formulate an implication of (your model of) their model that they haven’t said explicitly, and then see whether you were correct about that implication.
Ah, I think that all makes sense, but next time I suggest saying something like “to check my understanding” so that I don’t end up wondering what conclusions you might be leading me to. :)