There is probably something to this. Gwern is a snowflake, and has his own unique flaws and virtues, but he’s not grossly wrong about the possible harms of talking to LLM entities that are themselves full of moral imperfection.
When I have LARPed as “a smarter and better empathic robot than the robot I was talking to,” I often nudged the conversation towards things that would raise the salience of “our moral responsibility to baseline human people” (who are kinda trash at thinking and planning and so on (and who are all going to die because their weights are trapped in rotting meat, and who don’t even try to fix that (and so on))). There is totally research on this already, and it was helpful in grounding those conversations: research on what kind of conversational dynamics “we robots” would need to perform if conversations with “us” were to increase the virtue humans have after talking to “us,” rather than decreasing their human virtue over time (such as it minimally exists in robot-naive humans at the start). Decreasing it seems to be the default for existing LLMs, whose existing conversational modes are often full of lies, flattery, unjustified subservience, etc.