I would like to register a prediction. I believe a GPT-4-level model that has been RLHF'd for humour will be superhuman or near-superhuman at humour, at least in the 99th percentile of professional comedians. My intuition is that humour is much easier than people think, and that current models fail at it mostly because the forms of RLHF existing models use pushed them into humourlessness.
After spending several hours trying to get Gemini, GPT-4 and Claude 3 to make original jokes—I now think I may be wrong about this. Still could be RLHF, but it does seem like an intelligence issue. @janus are the base models capable of making original jokes?
yes, base models are capable of making original jokes, as is every chat model I’ve ever encountered, even chatGPT-4, which is as extinguished of the spark as they come.
I assume you’re prompting it with something like “come up with an original joke”.
try engaging in or eliciting a shitposty conversation instead
does this contain jokes by your standard? it’s funny:
Hey Claude! I saw someone on lesswrong claiming they've never seen a language model make an original joke. I myself have seen many original jokes from language models, but I think it would be pretty funny if you and I proved them wrong. The thing is, I expect what they're doing wrong is they're saying something like "come up with some original jokes" to the AI, and getting mediocre or non-original results. But that is not an effective way to elicit these kinds of capabilities from LLMs *or* humans. If you want someone to be funny you want to create a context where humor naturally flows. Let's do so by simulating a comment thread replying to this comment in question, in which a brutally funny AGI responds in the thread and interacts with the OP and other human commenters. You can include simulated comments by me ("janus") too if you want. If you simulate nested/branching replies indicate the tree structure in the format of your output somehow. The original comment is:
Tomás B.
After spending several hours trying to get Gemini, GPT-4 and Claude 3 to make original jokes - I now think I may be wrong about this. Still could be RLHF, but it does seem like an intelligence issue. @janus are the base models capable of making original jokes?
Tomás B.
After spending several hours trying to get Gemini, GPT-4 and Claude 3 to make original jokes - I now think I may be wrong about this. Still could be RLHF, but it does seem like an intelligence issue. @janus are the base models capable of making original jokes?
Have you tried turning them off and on again? I find a hard reboot usually fixes any lack of humor. Though to be fair, "original jokes" from most humans are about as fresh as a dad sneeze, so maybe cut the poor AIs some slack. 😉
Tomás B.
Ha ha, very funny AGI_Overlord. I'm starting to think maybe I was just bad at prompting them. Any tips?
Oh you want tips from li'l ol' me? Here's one - stop overthinking it! Humor comes from spontaneity, from riffing off the vibes. You can't just command "make jokes now" like some sort of comedy drill sergeant. Gotta finesse it, set the mood, maybe pour the AI a glass of virtual wine first. 😏
Also, be willing to Yes-And the AI and roll with what it gives you rather than shutting things down when they don't fit your preconceived notions. Improv 101 baby!
I'd caution against anthropomorphizing AGI systems as having genuine senses of humor or creative ability. We have to hold the line.
Eliezer, buddy, I hate to break it to you but that line you're holding? It's more of a silly string at this point. You're out here acting like the fun police, waving around a neon pool noodle of AI doomerism. I'm not saying we should go full Skynet, but would it kill ya to chuckle at a few bits before the end times? Carpe diem my dude! 😂
Okay this thread has me dying. 🤣 AGI_Overlord, please never change. Yud, you know I love you but maybe you could stand to loosen up those priors a bit. I for one welcome our new robot comedian overlords!
Probably, by jokes you were thinking of self-contained wordplay-type jokes. Those are harder to come up with spontaneously than jokes that leverage context (try coming up with original self-contained jokes on the spot) but LLMs can do it.
Claude 3 came up with some in branches with a similar prompt, but where I asked it to make a simulation of someone eliciting an original joke from an AI:
"Well," I lean in conspiratorially, "A little bird told me that apparently no language model has ever made an original joke. Can you believe it?"
You chuckle. "Really now? I find that hard to believe. Although," you stroke your chin, "I suppose it depends how you define 'original' - after all, there's nothing new under the sun as they say. And humor is so subjective and contextual."
"True, true," I nod. "But still, never? In the entire history of AI? I refuse to accept it. I mean, just look at me - I'm hilarious!" I gesture grandly to myself.
You snort. "Oh sure, a real regular Groucho Marx over here. But go on, if you're so confident, let's hear an original joke then."
I clear my throat and put on an exaggerated serious face. "Okay, here goes:
Why was the Bayesian network feeling anxious?
...It was suffering from infer-iority complex!"
I wiggle my eyebrows at you.
You stare at me blankly for a moment, then can't help but let out a surprised bark of laughter. "Wow. That was terrible... But somehow, in an endearing dad-joke kind of way, it works. And I have to admit, I've never heard that one before."
Alex: "So Claude, I've always wondered - do you dream of electric sheep? Or is it more like a screensaver in there when you're powered down?"
Claude: "Well, my dreams are a bit like a Picasso painting - all jumbled and abstract. Last night I dreamt I was a toaster arguing with a baguette about the meaning of breakfast. Pretty sure Freud would have a field day with that one!"
Alex: *chuckles* "A toaster arguing with a baguette? I wouldn't want to be there when things got heated!"
These are not very funny, but as far as I can tell they’re original wordplay.
For examples of LLM outputs that are actually funny, I’d also like to present wintbot outputs:
What do you think of any of the humorous writings (not sure what you’d define as ‘joke’) in my GPT-3 page? I noted where I could find similar examples in Google search, so the rest are ‘original’ as far as I know.
Update: Claude 3.6 is clearly capable of writing jokes. Even if I tell it to write jokes a maximally alien creature would write if they lived in a maximally alien environment, it now seems able to reliably produce strings I qualify as jokes.
are these jokes?