You could do it by embedding the text of each post, and then averaging all the embeddings of each tag’s posts into a single ‘tag embedding’, which summarizes the gestalt of all the posts with a given tag. Then you could do my sort trick, or just use a standard clustering algorithm to cluster the tags into 12 clusters, and ask GPT to label each cluster using the list of titles, say.
This would address your points about GPT being unable to ‘plan’ or being misled by idiosyncratic uses of words like ‘jasmine’. It would also produce a much more even distribution over the 12 clusters, unless there truly was an extremely skewed distribution, along with the other advantages I mentioned, like not forgetting or skipping any entries or confusing item counts.
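For concreteness, here’s a minimal sketch of that pipeline in Python, under some stand-in assumptions: posts_by_tag is a hypothetical dict mapping each tag to the texts of its posts, and scikit-learn’s TfidfVectorizer fills in for whatever embedding model you’d actually use.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def cluster_tags(posts_by_tag, n_clusters=12):
    """Embed each post, average per tag, and cluster the tags into groups."""
    tags = list(posts_by_tag)
    all_posts = [post for tag in tags for post in posts_by_tag[tag]]
    # TF-IDF is just a runnable stand-in for a neural embedding model.
    vectorizer = TfidfVectorizer().fit(all_posts)
    # One 'tag embedding' per tag: the mean of its posts' vectors,
    # summarizing the gestalt of everything filed under that tag.
    tag_vecs = np.array([
        np.asarray(vectorizer.transform(posts_by_tag[tag]).mean(axis=0)).ravel()
        for tag in tags
    ])
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(tag_vecs)
    # Group the tags by cluster; each group is what you'd hand to GPT to label.
    clusters = {}
    for tag, cluster in zip(tags, labels):
        clusters.setdefault(cluster, []).append(tag)
    return clusters
```

Swapping in actual neural embeddings only changes how tag_vecs gets built; the clustering and labeling steps stay the same.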
Interesting, yes. Sure. But keep in mind that what I was up to in that paper is much simpler. I wasn’t really interested in organizing my tag list. That’s just a long list that I had available to me. I just wanted to see how ChatGPT would deal with the task of coming up with organizing categories. Could it do it at all? If so, would its suggestions be reasonable ones? Further, since I didn’t know what it would do, I decided to start with shorter lists. It was only when I’d determined that it could do the task in a reasonable way with the shorter lists that I threw the longer list at it.
What I’ve been up to is coming up with tasks where ChatGPT’s performance gives me clues as to what’s going on internally. Whereas the mechanistic interpretability folks are reverse-engineering from the bottom up, I’m working from the top down. Now, in doing this, I’ve already got some ideas about how semantics is structured in the brain; that is, I’ve got some ideas about the device that produces all those text strings. Not only that, but, horror of horrors, those ideas are based in ‘classical’ symbolic computing. But my particular set of ideas tells me that, yes, it makes sense that ANNs should be able to induce something that approximates what the brain is up to. So I’ve never for a minute thought the ‘stochastic parrots’ business was anything more than a rhetorical trick. I wrote that up after I’d worked with GPT-3 a little.
At this point I’m reasonably convinced that in some ways, yes, what’s going on internally is like a classical symbolic net, but in other ways, no, it’s quite different. I reached that conclusion after working intensively on having ChatGPT generate simple stories. After thinking about that for a while I decided that, no, whatever is going on is quite different from a classical symbolic story grammar. But then, what humans do also seems to me in some ways unlike classical story grammars.
It’s all very complicated and very interesting. In the last month or so I’ve started working with a machine vision researcher at Goethe University in Frankfurt (Visvanathan Ramesh). We’re slowly making progress.