I don’t know quite how to respond to that. Without having read the piece that took me, I don’t know, say 30-40 hours to write, spread over two or three weeks (including the hour or so I spent with ChatGPT), you’re telling me that it couldn’t possibly be worth more than a tweet. How do you know that? Have you thought about what the task involves? If you had a list of 50 topics to organize, how would you do it manually? What about 655 topics? How would you do that manually?
How would you do it using a computer? Sure, given well-defined items and clear sort criteria, computers do that kind of thing all the time, and over humongous collections of items. It’s a staple process of programming. But these items are not at all well-defined, and as for the sort criteria, well, ChatGPT has to figure those out for itself.
Yours is not a serious comment.
If I were going to sort a list of 655 topics into a linear order and I didn’t have a well-defined hierarchy or pre-existing list to work from, I might use one of two approaches:
1. For manual sorting along a single axis, I probably can’t assign any sort of cardinal value, but I can make comparisons of the form ‘A is more/less X than B’. Then I can use my resorter utility to lighten the burden, compared to an obvious approach like trying to herd them all in a spreadsheet or text buffer. (A sketch of this follows the list.)
2. If I prefer to automate it, I can embed them with a neural net (presumably they have text descriptions or titles, or even abstracts if they are things like papers or URLs) and then ‘sort them’ by simply picking one to start with, finding the ‘nearest’ by embedding, and repeating until they are all in one giant list. I call this ‘sort by magic’ or ‘sorting by semantic similarity’. (A sketch follows the note below.)
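For concreteness, here is a minimal sketch of what approach #1 might look like in Python. This only illustrates sorting by human pairwise judgments; it is not the actual resorter utility, and the prompt wording is invented:

```python
# Sketch of approach #1: a comparison sort driven by human judgments.
# Illustrative only -- not the real `resorter` tool.
from functools import cmp_to_key

def ask_human(a: str, b: str) -> int:
    """Ask which of two items rates higher on the chosen axis X.
    Returns negative if `a` should come first, positive if `b` should."""
    answer = ""
    while answer not in ("a", "b", "="):
        answer = input(f"Is (a) {a!r} or (b) {b!r} more X? [a/b/=]: ").strip().lower()
    return {"a": -1, "b": 1, "=": 0}[answer]

def manual_sort(items: list[str]) -> list[str]:
    # Python's built-in sort (Timsort) needs only O(n log n) comparisons,
    # far fewer than eyeballing every pair in a spreadsheet or text buffer.
    return sorted(items, key=cmp_to_key(ask_human))
```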
(Note that the embedding approach (#2), while a lot more work up front than simply tossing a list into the ChatGPT text box, has many advantages beyond just producing the list: it avoids any issues with GPT-4 hallucinating, forgetting, being very expensive to call on long lists, lists not fitting in context, the API being down, etc.)
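And a minimal sketch of the greedy nearest-neighbor chaining behind ‘sort by magic’ (#2), assuming the items have already been embedded into an (n, d) matrix by whatever embedding model you prefer:

```python
# Sketch of approach #2: 'sorting by semantic similarity'.
import numpy as np

def greedy_semantic_sort(items: list[str], emb: np.ndarray, start: int = 0) -> list[str]:
    # Normalize rows so a dot product equals cosine similarity.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    order = [start]
    remaining = set(range(len(items))) - {start}
    while remaining:
        last = emb[order[-1]]
        # Append the unvisited item most similar to the one just placed.
        nxt = max(remaining, key=lambda i: float(emb[i] @ last))
        order.append(nxt)
        remaining.remove(nxt)
    return [items[i] for i in order]
```

Changing `start` changes the entire chain, which is the ‘specify where to start’ knob mentioned below.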
This produces some interesting effects: because such lists have contents that naturally cluster, reading through the sorted list will often reveal ‘obvious’ clusters as the list transitions from one cluster to the next. There is no a priori way to decide how many clusters there ‘actually’ are, but I found that, roughly, k = sqrt(n) worked well to pick out a reasonably evenly populated & meaningful set of clusters.
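If you want the clusters pulled out explicitly rather than read off by eye, the same k = sqrt(n) heuristic drops straight into a standard clustering algorithm. A sketch using scikit-learn’s k-means, reusing the embedding matrix from above (the function names are illustrative):

```python
# Sketch: cluster the embedded items with k = sqrt(n).
import math
import numpy as np
from sklearn.cluster import KMeans

def cluster_items(items: list[str], emb: np.ndarray) -> dict[int, list[str]]:
    k = max(1, round(math.sqrt(len(items))))  # e.g. 655 topics -> k = 26
    labels = KMeans(n_clusters=k, random_state=0).fit_predict(emb)
    clusters: dict[int, list[str]] = {i: [] for i in range(k)}
    for item, label in zip(items, labels):
        clusters[int(label)].append(item)
    return clusters
```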
Once you have picked k and the clusters are extracted like that, it’s easy to grab a cluster and make it a sublist, for example, and to give it a name. (In fact, I even have a feature where I feed a cluster into GPT-4 as a list and ask it for a descriptive name.) Or you can start editing it to fix up problems, or you can specify where to start to get a better list, or you can feed it into #1 as a starting point.
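That naming feature amounts to one prompt per cluster. A sketch using the openai Python client (v1+); the prompt wording here is an invented stand-in, not the exact one used:

```python
# Sketch: ask GPT-4 for a short descriptive name for one cluster.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def name_cluster(members: list[str]) -> str:
    listing = "\n".join(f"- {m}" for m in members)
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": "Give a short descriptive name (1-3 words) "
                       f"for this cluster of topics:\n{listing}",
        }],
    )
    return response.choices[0].message.content.strip()
```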
So for example, my psychology/smell tag broke down nicely into a ‘perfume’, a ‘human’, and an ‘animal’ tag. I created perfume & human as more precise sub-tags, and left the animal links alone.

I don’t know what these mean: “sort a list of 655 topics into a linear order,” “sorting along a single axis.” The lists I’m talking about are already in alphabetical order. The idea is to come up with a set of categories which you can use to organize the list into thematically coherent sublists. It’s like you have a library of 1,000 books: how are you going to put them on shelves? You could group them alphabetically by title or by author’s (last) name. Or you could group them by subject matter. In doing this you know what the subjects are and have a sense of what things you’d like to see on the same shelves. This is what you call ‘sorting by semantic similarity.’
The abstract of the paper explains what I was up to. But I wasn’t using books; I was using unadorned lists of categories. When I started I didn’t know what ChatGPT would do when given a list for which it had to come up with organizing categories. I know how I used those labels, but it knows nothing of that. So I gave it a try and found out what it could do. Things got interesting when I asked it to go beyond coming up with organizing categories and to actually sort list items into those categories.
I’ve also played around with having ChatGPT respond to clusters of words.
You could do it by embedding the text of each post, and then averaging all the embeddings of each tag’s posts into a single ‘tag embedding’, which summarizes the gestalt of all the posts with a given tag. Then you could do my sort trick, or just use a standard clustering algorithm to cluster the tags into 12 clusters, and ask GPT to label each cluster using the list of titles, say.
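A sketch of that tag-embedding idea, assuming the per-post embeddings have already been computed and grouped by tag (the `posts_by_tag` structure is a hypothetical stand-in for that preprocessing):

```python
# Sketch: average post embeddings into tag embeddings, then cluster the tags.
import numpy as np
from sklearn.cluster import KMeans

def cluster_tags(posts_by_tag: dict[str, np.ndarray], n_clusters: int = 12) -> dict[int, list[str]]:
    """posts_by_tag maps each tag to an (m, d) matrix of its posts' embeddings."""
    tags = list(posts_by_tag)
    # One 'tag embedding' per tag: the mean of its posts' embeddings,
    # summarizing the gestalt of everything filed under that tag.
    tag_emb = np.stack([posts_by_tag[t].mean(axis=0) for t in tags])
    labels = KMeans(n_clusters=n_clusters, random_state=0).fit_predict(tag_emb)
    clusters: dict[int, list[str]] = {i: [] for i in range(n_clusters)}
    for tag, label in zip(tags, labels):
        clusters[int(label)].append(tag)
    return clusters
```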
This would address your points about GPT being unable to ‘plan’ or being misled by idiosyncratic uses of words like ‘jasmine’. It would also produce a much more even distribution over the 12 clusters, unless there truly was an extremely skewed distribution (as well as the other advantages I mentioned like not forgetting or skipping any entries or confusing item counts or whatever).
Interesting, yes. Sure. But keep in mind that what I was up to in that paper is much simpler. I wasn’t really interested in organizing my tag list. That’s just a long list that I had available to me. I just wanted to see how ChatGPT would deal with the task of coming up with organizing categories. Could it do it at all? If so, would its suggestions be reasonable ones? Further, since I didn’t know what it would do, I decided to start first with a shorter list. It was only when I’d determined that it could do the task in a reasonable way with the shorter lists that I threw the longer list at it.
What I’ve been up to is coming up with tasks where ChatGPT’s performance gives me clues as to what’s going on internally. Whereas the mechanistic interpretability folks are reverse-engineering from the bottom up, I’m working from the top down. Now, in doing this, I’ve already got some ideas about how semantics is structured in the brain; that is, I’ve got some ideas about the device that produces all those text strings. Not only that, but horror of horrors! Those ideas are based in ‘classical’ symbolic computing. But my particular set of ideas tells me that, yes, it makes sense that ANNs should be able to induce something that approximates what the brain is up to. So I’ve never for a minute thought the ‘stochastic parrots’ business was anything more than a rhetorical trick. I wrote that up after I’d worked with GPT-3 a little.
At this point I’m reasonably convinced that in some ways, yes, what’s going on internally is like a classical symbolic net, but in other ways, no, it’s quite different. I reached that conclusion after working intensively on having ChatGPT generate simple stories. After thinking about that for a while I decided that, no, something’s going on that’s quite different from a classical symbolic story grammar. But then, what humans do also seems to me, in some ways, unlike classical story grammars.
It’s all very complicated and very interesting. In the last month or so I’ve started working with a machine vision researcher at Goethe University in Frankfurt (Visvanathan Ramesh). We’re slowly making progress.