Get ChatGPT (ideally Claude of course, but the normies only know ChatGPT) to analyze your text messages, tell you that he’s avoidant and you’re totes mature, or that you’re not crazy, or that he’s just not that into you.
“Mundane utility”?
Faced with that task, the best ChatGPT or Claude is going to do is to sound like the glib, performative, unthinking, off-the-top “advice” people throw around in social media threads when asked similar questions, probably mixed with equally unhelpful corporate-approved boilerplate biased toward the “right” way to interpret things and the “right” way to handle relationships. Maybe worded to sound wise, though.
I would fully expect any currently available LLM to miss hints, signs that private language is in play, signs of allusions to unmentioned context, indirect implications, any but the most blatant humor, and signs of intentional deception. I’d expect it to come up with random assignments of blame (and possibly to emphasize blame while claiming not to); to see problems where none existed; to throw out nice-sounding truisms about nothing relevant; to give dumb, unhelpful advice about resolving real and imagined problems… and still probably to miss obvious warning signs of really dangerous situations.
I have a “benchmark” that I use on new LLMs from time to time. I paste the lyrics of Kate Bush’s “The Kick Inside” into a chat, and ask the LLM to tell me what’s going on. I picked the song because it’s oblique, but not the least bit unclear to a human paying even a little attention.
The LLMs say all kinds of often correct, and even slightly profoundish-sounding, things about tone and symbolism and structure and whatnot. They say things that might get you a good grade on a student essay. They always, utterly fail to get any of the simple story behind the song, or the speaking character’s motivations, or what’s obviously going to happen next. If something isn’t said straight out, it doesn’t exist, and even if it is, its implications get ignored. Leading questions don’t help.
I think it’s partly that the song’s about incest and suicide, and the models have been “safetied” into being blind on those topics… as they’ve likely also been “safetied” into blindness to all kinds of realities about human relationships, and to any kind of not-totally-direct-or-honest communication. Also, partly I think they Just Don’t Get It. It’s not like anybody’s even particularly tried to make them good at that sort of thing.
That song is no harder to interpret than a bunch of contextless text messages between strangers. In fact it’s easier; she’s trying hard to pack a lot of message into it, even if she’s not saying it outright. When LLMs can get the basic point of poetry written by a teenager (incandescently talented, but still a teenager), maybe their advice about complicated, emotion-carrying conversations will be a good source of mundane utility...
So, as soon as I saw the song name I looked it up, and I had no idea what the heck it was about until I returned and kept reading your comment. I tried getting Claude to expand on it. Every single time, it recognized the incest themes. None of the first messages recognized suicide, but many of the second messages did, when I asked what the character singing this is thinking/intending.
But I haven’t found leading questions sufficient to get it to bring up suicide first. Wait, nope! Found a prompt where it’s now consistently bringing up suicide. I had to lead it pretty hard, but I think this prompt won’t make it bring up suicide for songs that don’t imply it… yup, tried it on a bunch of different songs, the interpretations all match mine closely, now including negative parts. Just gotta explain why you wanna know so bad, so I think people having relationship issues won’t get totally useless advice from Claude. Definitely hard to get the rose-colored glasses off, though, yeah.
It’s been some time since models have become better than the average human at understanding language.