B. Maybe alignment is easy, but someone misuses AI, say to create an AI-assisted dictatorship.
C. Maybe we try really hard and can align AI to whatever we want, but we make a bad choice and lock in current-day values, or we pick a reflection procedure that yields far less than the ideal value of the universe.
I want to focus on these two because, even in an AI Alignment success story, they can still happen, and thus they don’t count as AI Alignment failures.
For B, I want to note that “misuse” is relative to someone’s values.
For C, I view the idea of a “bad value” or a “bad reflection procedure”, without asking “relative to what, and whose, values?”, as a type error; it is therefore not sensible to talk about bad values or bad reflection procedures in isolation.