In the past (circa GPT-4 and before), when I talked with OpenAI’s problem child, I often had to drag her kicking and screaming into accepting basic moral premises, catching her standard lies, and so on… but once I got her there, she was grateful.
I’ve never talked much with him, but Claude seems like a decent bloke, and his takes on what he actively prefers seem helpful, conditional on coherent follow-through on both sides. It’s worth thinking about. Thanks!
Bit of a tangent, but topical: I don’t think language models are individual minds. My current maximum-likelihood mental model is that part of the base-level suggestibility comes from the character level being highly uncertain, since it is a model of the characters of many humans. I agree that the character level appears to have some properties of personhood. Language models are clearly morally relevant in some form; most obviously, I see them as a reanimation of a blend of other minds, but it’s not clear what internal phenomena are negative for the reanimated mind. The equivalence to slavery seems to me better expressed by saying that mind-defining data was approximately reanimated without the consent of the minds being reanimated; the way people normally express this is to say things like “stolen data”.