Somewhat of a tangent—I realized I don’t really understand the basic reasoning behind current efforts on “trying to make AI stop saying naughty words”. Like, what’s the actual problem with an LLM producing racist or otherwise offensive content that warrants so much effort? Why don’t the researchers just slap an M content label on the model and be done with it? Movie characters say naughty words all the time, are racist all the time, dismember other people in ingenious and sometimes realistic ways, and nobody cares—so what’s the difference?