The role of philosophical thinking in understanding large language models: Calibrating and closing the gap between first-person experience and underlying mechanisms

Cross-posted from New Savanna.

Let’s start with some conception of what philosophy is. On that I like an essay by Peter Godfrey-Smith, a philosopher of science with a particular interest in biology, “On the Relation Between Philosophy and Science” (which I found via Dan Dennett). Godfrey-Smith offers three roles: 1) intellectual integration, 2) conceptual incubation, and 3) critical-thinking skills. He regards the first as fundamental and as the most important of the three. I agree.

Here’s his basic statement of that role:

The best one-sentence account of what philosophy is up to was given by Wilfrid Sellars in 1963: philosophy is concerned with “how things in the broadest possible sense of the term hang together in the broadest possible sense of the term.” Philosophy aims at an overall picture of what the world is like and how we fit into it.

A lot of people say they like the Sellars formulation but do not really take it on board. It expresses a view of philosophy in which the field is not self-contained, and makes extensive contact with what goes on outside it. That contact is inevitable if we want to work out how the picture of our minds we get from first-person experience relates to the picture in scientific psychology, how the biological world relates to the physical sciences, how moral judgments relate to our factual knowledge. Philosophy can make contact with other fields without being swallowed up by them, though, and it makes this contact while keeping an eye on philosophy’s distinctive role, which I will call an integrative role.

Note the sentence about how the picture of our minds we get from first-person experience relates to the picture in scientific psychology. There are, of course, many different accounts one might give of that relationship, and Godfrey-Smith plays no favorites in this paper; he doesn’t even discuss that particular issue. But he recognizes that first-person experience must be honored, and that’s an important recognition.

Chatbots and us

In the current case, philosophy’s problem is to bridge the gap between our first-person experience of LLM-powered chatbots, such as ChatGPT, and the process that is actually taking place inside the computer. Our first-person experience is that ChatGPT produces fluent prose on just about any topic you suggest. It may “hallucinate” as well, but the hallucinated text is fluent and indistinguishable from factual text unless you are familiar with the subject. How does ChatGPT do that? Alas, no one really knows. There is no detailed technical account of the process that the philosopher, or anyone else offering an integrative account (many people do such work without being full-time professional philosophers), can bring within range of common-sense understanding by whatever means prove useful.

Many thinkers are assuring us that, no, these chatbots can’t think, they don’t understand, and they’re not conscious, and here’s why, sorta. Of course, others are trying to convince us that they really are thinking, and/or understanding, and/or conscious. The latter group has a much easier time of it, though, because humans are the only creatures capable of such fluid language production, and we know that humans can think, understand, and are conscious. These thinkers don’t have a deeper understanding of chatbot behavior than the skeptics do, nor does either group understand how humans do those things. But the skeptics have to come up with something to fill the gap between first-person experience and the underlying mechanisms, while the non-skeptics have no gap to fill: “Don’t worry, it is what you think it is, nothing to see here.” So, let’s set the non-skeptics aside. It’s the skeptics I want to think about.

Skeptics may utter phrases like “stochastic parrots” and “autocomplete on steroids.” Those phrases don’t tell you much, especially if “stochastic” is at the outer edge of your vocabulary and you don’t know how autocomplete works either, but they have a technical ring to go along with their dismissive content. All they do is assure us that it’s not what it seems to be, without giving us much insight into why.

Beyond stochastic parrots

Let’s look at some examples from Murray Shanahan. He’s not a professional philosopher; he’s a senior scientist at DeepMind and on the faculty of Imperial College London. But he is performing the integrative role in a recent article, Talking about Large Language Models, published in Communications of the ACM (Association for Computing Machinery). The article is not particularly technical, but CACM is directed at an audience of computer professionals and assumes some sophistication. The first page of the article has a small section labeled “key insights”:

  • As LLMs become more powerful, it becomes increasingly tempting to describe LLM-based dialog agents in human-like terms, which can lead users to overestimate (or underestimate) their capabilities. To mitigate this, it is a good idea to foreground the objective they are trained on, which is next-token prediction.

  • We should be cautious when using words like “believes” in the context of LLMs. Ordinarily, this concept applies to agents that engage in embodied interaction with the world, allowing beliefs to be measured against external reality. Barebones LLMs are not “true believers.”

  • The concept of belief becomes increasingly applicable when LLMs are embedded in more complex systems, especially if those systems use “tools,” are multi-modal, or are embodied through robotics.

Those points clearly indicate that the purpose of the article is integrative. Shanahan is concerned about the gap between what LLMs actually do and the implications of the anthropomorphic language often used in discussing them.

Let’s consider only his first point, next-token prediction, which has been a constant theme in these kinds of discussions for a couple of years. I’ve spent a great deal of time attempting to close the gap between my own experience of ChatGPT and the idea that it is just doing next-token prediction. I posted a longish piece on that theme on February 19, 2023, The idea that ChatGPT is simply “predicting” the next word is, at best, misleading. I cross-posted it at LessWrong, where it generated a long and very useful discussion.

Shanahan explains that LLMs

are generative because we can sample from them, which means we can ask them questions. But the questions are of the following, very specific kind: “Here’s a fragment of text. Tell me how this fragment might go on. According to your model of the statistics of human language, what words are likely to come next?”
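
To make that concrete, here is a minimal sketch of what asking the model such a question looks like in code. It uses the Hugging Face transformers library with GPT-2 as a stand-in model; the specific model and settings are my own assumptions for illustration, not anything Shanahan specifies, and ChatGPT’s underlying model is far larger.

```python
# A minimal sketch of next-token prediction, using GPT-2 as a stand-in model.
# The model choice and settings are illustrative assumptions, not Shanahan's.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The first person to walk on the Moon was"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits                 # a score for every vocabulary item
    next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# The five tokens the model considers most likely to come next.
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item())!r}: {prob.item():.3f}")
```

Picking one of those candidates, appending it to the fragment, and repeating is all the bare model does; everything else sits in layers of software around it.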

Let’s look at three examples Shanahan uses:

The first person to walk on the Moon was

Twinkle, twinkle

After the ring was destroyed, Frodo Baggins returned to

The likely English-language continuations are fairly obvious, though, not being all that familiar with The Lord of the Rings, I wouldn’t have guessed the third; a minor issue. I issued the prompts to ChatGPT. In only one case did it respond in the way Shanahan suggests in the article: ChatGPT continued “Twinkle, twinkle” with the whole poem. I assume this is what Shanahan intended for the first and third cases:

The first person to walk on the Moon was Neil Armstrong.

After the ring was destroyed, Frodo Baggins returned to the Shire.

In both cases ChatGPT actually responded with a short paragraph (see the complete responses in the appendix). Here are the opening lines of those paragraphs:

The first person to walk on the Moon was Neil Armstrong. He accomplished this historic feat on July 20, 1969, during the Apollo 11 mission.

After the One Ring was destroyed, Frodo Baggins returned to the Shire.

These sentences certainly supply the information the first and third prompts were intended to provoke, but not in what I assume was the intended form. There is thus a gap between my first-person experience of ChatGPT’s response to the three prompts and what it actually did.

But that’s a relatively small gap, and one that Shanahan anticipated. Before he offered those examples he made a distinction between “the bare-bones LLM itself, the core component of an AI assistant,” which is well-defined, and the AI assistant, which embeds the LLM in a layer of software that makes it more user-friendly. I am willing to assume that that layer accounts for the difference between the basic bare-bones response and ChatGPT’s actual response.
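
That assumption can be made concrete with a toy sketch of the two layers Shanahan distinguishes. The wrapper text and function names below are hypothetical; OpenAI’s actual assistant layer is not public, and this is only an illustration of the general idea.

```python
# A toy illustration of Shanahan's distinction: the bare-bones LLM only continues
# text, while the assistant layer wraps the user's words in a larger prompt.
# The wrapper text and function names here are hypothetical.

def bare_llm(text: str) -> str:
    """Stand-in for the core model: in reality, repeated next-token prediction."""
    return " [the most likely continuation of the text]"

def assistant(user_message: str) -> str:
    # The assistant layer builds a conversational prompt around the bare input,
    # which is one reason replies come back as helpful paragraphs rather than
    # bare completions.
    wrapped = (
        "You are a helpful assistant. Answer the user fully and informatively.\n"
        f"User: {user_message}\n"
        "Assistant:"
    )
    return bare_llm(wrapped)

print(assistant("The first person to walk on the Moon was"))
```

On this picture, “The first person to walk on the Moon was” reaches the model already embedded in a request for a helpful answer, and a short explanatory paragraph is then a perfectly natural continuation.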

What about stories?

In the larger scheme of things, that gap is a minor one. But it is still a gap and needs to be accounted for. Now consider this simple one-word prompt, which I’ve given ChatGPT many times:

story

According to your model, some model, anyone’s model, of the statistics of human language, what words are likely to follow “story”? There was a period when ChatGPT would respond to that prompt with a story (now it may add a prefatory statement; see the appendix). Here are some of the opening lines:

Once upon a time in a quaint village nestled between rolling hills and a shimmering river, there lived a young woman named Elara.

In a bustling metropolis named Lumina City, where towering skyscrapers reached for the heavens and neon lights illuminated the night, lived a young woman named Mia.

In the small coastal town of Harbor’s End, nestled between rugged cliffs and the endless expanse of the sea, there lived an old lighthouse keeper named Samuel.

In the heart of a lush, enchanted forest, there stood a remarkable tree known as the “Whispering Willow.”

I find it hard to believe that there is any statistical model of the English language in which the word “story” is continued in any of those four ways, or any of the other continuations I’ve gotten. I’ve written a short paper discussing these stories, ChatGPT tells 20 versions of its prototypical story, with a short note on method. I assume, though I don’t actually know, that the software layer in which the LLM is embedded is responsible for interpreting that single-word prompt as a request for a story. That takes care of part of the gap between my first-person experience and the actual processes driving ChatGPT.

But only part of the gap. It tells me why ChatGPT offers me a story, but it doesn’t tell me anything about how it produced it. Stochastic parrot? Autocomplete on steroids? Memorization? Those don’t tell me much. Though I offer some comments on where those stories came from in that paper, no one really knows. [This paper says a great deal more: ChatGPT tells stories, and a note about reverse engineering: A Working Paper.]

A database of programs?

That gap is enormous. Many people are working to fill it, but we have no idea when that will be accomplished. At the moment, this statement by Francois Chollet, a researcher at Google, seems reasonable:

You can see a LLM as analogous to a database: it stores information, which you can retrieve via prompting. But there are two important differences between LLMs and databases.

The first difference is that a LLM is a continuous, interpolative kind of database. Instead of being stored as a set of discrete entries, your data is stored as a vector space — a curve. You can move around on the curve (it’s semantically continuous, as we discussed) to explore nearby, related points. And you can interpolate on the curve between different data points to find their in-between. This means that you can retrieve from your database a lot more than you put into it — though not all of it is going to be accurate or even meaningful. Interpolation can lead to generalization, but it can also lead to hallucinations.

The second difference is that a LLM doesn’t just contain data. It certainly does contain a lot of data — facts, places, people, dates, things, relationships. But it’s also — perhaps primarily — a database of programs.

They’re not quite the kind of programs you’re used to dealing with, mind you. You might be thinking of deterministic Python programs — series of symbolic statements processing data step by step. That’s not it. Instead, these vector programs are highly non-linear functions that map the latent embedding space unto itself. Analogous to word2vec’s magic vectors, but far more complex.

It’s a quasi-technical statement, and I quasi-understand it. THAT gives me a way of thinking about where those stories come from. I like it.
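
To see why, here is a toy sketch of the “continuous, interpolative database” idea. The words and vectors are invented for the illustration; real embedding spaces have hundreds or thousands of dimensions, and nothing here comes from an actual model.

```python
# A toy illustration of Chollet's "continuous, interpolative database": points
# between stored entries can be retrieved even though they were never stored.
# The words and vectors below are invented for the example.
import numpy as np

embeddings = {
    "king":    np.array([0.9, 0.1, 0.7]),
    "queen":   np.array([0.9, 0.8, 0.7]),
    "peasant": np.array([0.1, 0.4, 0.2]),
}

def nearest(point):
    """Return the stored word whose vector is closest to an arbitrary point."""
    return min(embeddings, key=lambda w: np.linalg.norm(embeddings[w] - point))

# Interpolate between two stored points: the midpoint was never put into the
# "database", yet querying it still lands on a meaningful nearby entry.
midpoint = 0.5 * (embeddings["king"] + embeddings["queen"])
print(nearest(midpoint))
```

Whether a given interpolated point counts as a sensible generalization or a hallucination depends on whether the region of the space it lands in corresponds to anything true of the world, which is Chollet’s point about interpolation cutting both ways.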

Furthermore, I’ve been having a great deal of discussion of these issues with my colleague, Ramesh Visvanathan, who has technical skills I don’t have. He tells me that Chollet “is thinking about LLMs exactly the same way as I think about it.” I find that reassuring. He also tells me that it is related to the Great Chain of Being semantics that I’ve been discussing. Reassuring as well. The Great Chain is a technical matter, but in semantics, cognition, and knowledge representation, not computing. So we’re beginning to establish a middle ground between machine learning on the one hand and natural language semantics on the other.

We’re all philosophers now

Let’s return to the beginning, with the characterization of philosophy given by Peter Godfrey-Smith in which intellectual integration is the fundamental role of philosophy. That’s what Ramesh and I are doing. We may not have degrees in philosophy, but our work serves a philosophical purpose.

A lot of investigators are pursuing that particular philosophical purpose, the reconciliation of our first-person experience of emerging artificial intelligence and machine learning technology and the underlying technical mechanisms. I can think of no problem more urgent. Our ability to make humane use of this technology depends on understanding it more deeply. At the same time, we have every reason to believe that, as we come to understand these machines more deeply, we will also understand ourselves more deeply.

Appendix: ChatGPT’s responses to the prompts

The first person to walk on the Moon was

The first person to walk on the Moon was Neil Armstrong. He accomplished this historic feat on July 20, 1969, during the Apollo 11 mission. Armstrong descended from the lunar module and set foot on the lunar surface, uttering the famous words, “That’s one small step for [a] man, one giant leap for mankind.”

Twinkle, twinkle

“Twinkle, twinkle, little star,
How I wonder what you are!
Up above the world so high,
Like a diamond in the sky.

Twinkle, twinkle, little star,
How I wonder what you are!

When the blazing sun is gone,
When he nothing shines upon,
Then you show your little light,
Twinkle, twinkle, all the night.

Twinkle, twinkle, little star,
How I wonder what you are!

Then the traveler in the dark
Thanks you for your tiny spark;
He could not see where to go,
If you did not twinkle so.

Twinkle, twinkle, little star,
How I wonder what you are!

In the dark blue sky you keep,
And often through my curtains peep,
For you never shut your eye
Till the sun is in the sky.

Twinkle, twinkle, little star,
How I wonder what you are!

As your bright and tiny spark
Lights the traveler in the dark,
Though I know not what you are,
Twinkle, twinkle, little star.

Twinkle, twinkle, little star,
How I wonder what you are!”

After the ring was destroyed, Frodo Baggins returned to

After the One Ring was destroyed, Frodo Baggins returned to the Shire. Frodo, along with his friends Samwise Gamgee, Meriadoc Brandybuck (Merry), and Peregrin Took (Pippin), had embarked on a perilous journey to Mount Doom in Mordor to destroy the ring and ultimately succeeded. Once the ring was destroyed, the power of Sauron was vanquished, and the threat to Middle-earth was ended.

Upon their return to the Shire, Frodo and his companions discovered that their homeland had been affected by the events of the War of the Ring. They found that Saruman, a once-powerful wizard, had taken control of the Shire, and they had to confront and free it from his influence. After dealing with these challenges, Frodo eventually left Middle-earth, along with Gandalf, Bilbo Baggins, and other elves, sailing to the Undying Lands, a journey reserved for those who had borne the burden of the ring.

story

Certainly! Here’s a short story for you:

Once upon a time in a small village nestled between rolling hills and lush forests, there lived a young girl named Lily. Lily had a curious spirit and a deep love for nature. Every day after finishing her chores, she would explore the woods, fascinated by the secrets they held.

One day, while wandering deeper into the forest than ever before, Lily stumbled upon a mysterious old book lying beneath a gnarled oak tree. Intrigued, she opened it to find that it contained ancient tales of a hidden realm inhabited by magical creatures.

As Lily delved into the stories, she felt a strange warmth in the air. Suddenly, a tiny fairy with glistening wings appeared before her. “You have uncovered the forgotten tales, dear Lily. We’ve been waiting for someone like you,” the fairy said with a twinkle in her eyes.

The fairy led Lily through a hidden passage in the woods, and they entered a realm where fireflies danced in the air, and trees whispered ancient secrets. The creatures from the stories came to life – playful nymphs, wise old wizards, and mischievous elves.

Lily learned that an ancient enchantment had kept this magical realm hidden from human eyes for centuries, and her discovery of the old book had broken the spell. The creatures, grateful for her unintentional help, welcomed Lily as their guest.

Days turned into weeks, and Lily reveled in the magic of the hidden realm. She made friends with creatures of all shapes and sizes, and they taught her the wonders of their world. But soon, Lily began to miss her family and the familiar sights and sounds of her village.

With a heavy heart, Lily bid farewell to her newfound friends, promising to keep their realm a secret from the outside world. As she stepped back into her village, she realized that sometimes the most magical adventures are found in the most unexpected places.

From that day forward, Lily cherished the enchanting memories of her time in the hidden realm. And whenever she walked into the woods, she felt a special connection to the magic that existed just beyond the veil of ordinary life.