Human values can change a lot over a short amount of time, to the extent that maybe the commonly used term “value drift” doesn’t describe it well. After reading Geoffrey Miller, my current model is that a big chunk of our apparent values comes from the need to do virtue signaling. In other words, we have certain values because it’s a lot easier to signal having those values when you really do have them. But the optimal virtues/values to signal can change quickly, due to positive and negative feedback loops in the social dynamics around virtue signaling and for other reasons (which I don’t fully understand). This in turn causes many people’s values to change quickly in response, and moreover causes the next generation (whose values are more malleable) to adopt values different from their parents’.
I don’t yet know the implications of this for AI alignment, but it seems like an important insight to share before I forget.