In this post sequence, I will introduce the “Value Change Problem” (VCP), argue why I think it’s critical to getting ethical AI design right, and outline what risks we might expect if we fail to properly address it.
The core claim can be summarised as follows:
AI alignment must address the problem of (il)legitimate value change; that is, the problem of making sure AI systems neither cause value change illegitimately, nor forestall legitimate cases of value change in humans and society.
The case for the VCP rests on three core premises:
human values are malleable (post 1)
some instances of value change seem (un)problematic (post 2)
AI systems are (and will become increasingly) capable of affecting people’s value change trajectories (post 3)
I then discuss two ways in which this risk can manifest:
by causing illegitimate cases of value change (upcoming: post 4), and
by preventing legitimate cases of value change (upcoming: post 5).
For each of these I want to ask: What is the risk? What are plausible mechanisms by which it manifests? In what ways does it manifest already, and how is it likely to be exacerbated going forward, as AI systems become more advanced and more widely deployed? In particular, I introduce the notions of ‘performative power’ and ‘value collapse’ as critical phenomena involved in the manifestation of these two risks, respectively.
The different posts are more or less self-contained and can be read on their own.
In particular, if you are already sufficiently convinced by the core case for the VCP, you may want to skip directly to the discussion of risks in posts 4 and 5.