gwern comments on Language Models Model Us

gwern 18 May 2024 20:22 UTC
7 points
Yes, I’ve never had any difficulty replicating the gwern identification: https://chatgpt.com/share/0638f916-2f75-4d15-8f85-7439b373c23c It also does Scott Alexander: https://chatgpt.com/share/298685e4-d680-43f9-81cb-b67de5305d53 https://chatgpt.com/share/91f6c5b8-a0a4-498c-a57b-8b2780bc1340 (Examples from sinity just today, but they parallel all of the past ones I’ve done: sometimes it’ll balk a little at making a guess or identifying someone, but that is usually not hard to overcome.)

One interesting thing is that the extensive reasoning it gives may not be faithful. Notice that in identifying Scott Alexander’s recent Reddit comment, it gets his username wrong: the username it gives does not exist at all. (I initially speculated that it was using retrieval since OA & Reddit have struck a deal; but obviously, if it had, or had been trained on the actual comment, it would at least get the username right.) And in my popups comment, I see nothing that points to LessWrong, but since I was lazy and didn’t copyedit that comment, it is much more idiosyncratic than usual; so what I think ChatGPT-4o does there is immediately deduce that it’s me from the writing style & content, infer that it could not be a tweet due to length, nor a Gwern.net quote because it is clearly a comment on social media responding to someone, and then guess that it’s LW rather than HN, and presto.