Suppose I write the first half of a very GPT-esque story. If I then ask GPT to complete that story, won’t it follow exactly the same structure as always? If so, how can you say that came from a plan—it didn’t write the first half of the story! That’s just what stories look like. Is that more surprising than a token predictor getting basic sentence structure correct?
For hidden thoughts, I think this is very well defined. It won’t be truly ‘hidden’, since we can examine every node in GPT, but we know for a fact that GPT is purely a function of the current stream of tokens (unless I am quite mistaken!). A hidden plan would look like some other state that GPT carries from token to token that is not output. I don’t think OpenAI engineers would have a hard time making such a model, and it may then really have a global plan that travels from one token to the next (or not; it would be hard to say). But how could GPT? It has nowhere to put the plan except in plain sight.
Or: does AlphaGo have a plan? It explicitly considers future moves, but it does just as well if you give it a Go board in a particular state X as it would if it played a game that happened to reach state X. If there is a ‘plan’ that it made, it wrote that plan on the board and nothing is hidden. I think it’s more helpful and accurate to describe AlphaGo as “only” picking the best next move rather than planning ahead—but doing a good enough job of picking the best next move means you pick moves that have good follow-up moves.
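To make the “purely a function of the current stream of tokens” point concrete, here is a minimal sketch of autoregressive decoding. The next_token_distribution function is a hypothetical placeholder for a real model’s forward pass, not anyone’s actual implementation; the point is only that the visible token sequence is the sole state carried from step to step, so any “plan” has to live in the output itself.

```python
import random

def next_token_distribution(tokens):
    # Hypothetical stand-in for a real language-model forward pass.
    # Its only input is the visible token sequence; there is no other state.
    vocab = ["the", "mountain", "village", "fire", "<eos>"]
    return {tok: 1.0 / len(vocab) for tok in vocab}

def generate(prompt_tokens, max_new_tokens=50):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        dist = next_token_distribution(tokens)  # depends only on `tokens`
        next_tok = random.choices(list(dist.keys()), weights=list(dist.values()))[0]
        if next_tok == "<eos>":
            break
        tokens.append(next_tok)  # the growing sequence is the only "memory"
    return tokens

print(" ".join(generate(["once", "upon", "a", "time"])))
```

A model with a genuinely hidden plan would have to thread some extra state through that loop alongside the tokens, which is exactly what the standard GPT interface does not do.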
For hidden thoughts, I think this is very well defined.
Not for humans, and that’s what I was referring to. Sorry about the confusion.
“Thought” is just a common-sense idea. As far as I know, we don’t have a well-defined concept of thought stated in terms of brain states. Now, I believe Walter Freeman has conjectured that thoughts reflect states of global coherence across a large swath of cortex, perhaps a hemisphere, but that’s a whole other intellectual world.
If so, how can you say that came from a plan—it didn’t write the first half of the story!
But it read it, no? Why can’t it complete it according to its “plan,” since it has no way of knowing the intentions of the person who wrote the first half?
Let me come at this a different way. I don’t know how many times I’ve read articles of the “computers for dummies” type where it said it’s all just ones and zeros. And that’s true. Source code may be human-readable, but when it’s compiled all the comments are stripped out and the rest is converted to ones and zeros. What does that tell you about a program? It depends on your point of view and what you know. From a very esoteric and abstract point of view, it tells you a lot. From the point of view of someone reading Digital Computing for Dummies, it doesn’t tell them much of anything.
I feel a bit like that about the assertion that LLMs are just next-token-predictors. Taking that in conjunction with the knowledge that they’re trained on zillions of tokens of text, those two things put together don’t tell you much either. If those two statements were deeply informative, then mechanistic interpretation would be trivial. It’s not. Saying that LLMs are next-token predictors puts a kind of boundary on mechanistic interpretation, but it doesn’t do much else. And saying it was trained on all these texts, that doesn’t tell you much about the structure the model has picked up.
What intellectual work does that statement do?

I gave one example of the “work” this does: that GPT performs better when prompted to reason first rather than state the answer first. Another example is: https://www.lesswrong.com/posts/bwyKCQD7PFWKhELMr/by-default-gpts-think-in-plain-sight

On the contrary, you mainly seem to be claiming that thinking of LLMs as working one token at a time is misleading, but I’m not sure I see any examples of the misleading conclusions you think people draw from it. Where do you think people go wrong?
Over there in another part of the universe there are people who are yelling that LLMs are “stochastic parrots.” Their intention is to discredit LLMs as dangerous, evil devices. Not too far away from those folks are those saying it’s “autocomplete on steroids.” That’s only marginally better.

Saying LLMs are “next word predictors” feeds into that. Now, I’m talking about rhetoric here, not intellectual substance. But rhetoric matters. There needs to be a better way of talking about these devices for a general audience.

Oh, thanks for the link. It looks interesting.
Perhaps you could simply ask ChatGPT? “Please tell me a story without making any plans about the story beforehand.” vs. “Please make a plan for a story, then tell me the story, and attach your plan at the end of the story.” Will the resulting stories differ, and how? My prediction: the plan attached at the end of the story won’t be very similar to the actual story.
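If anyone wants to rerun this comparison programmatically rather than in the chat interface, here is a rough sketch. It assumes the OpenAI Python client as it existed in early 2023; the model name, interface, and response format are assumptions and may have changed since.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; supply your own key

PROMPTS = [
    "Please tell me a story without making any plans about the story beforehand.",
    "Please make a plan for a story, then tell me the story, "
    "and attach your plan at the end of the story.",
]

for prompt in PROMPTS:
    # Legacy ChatCompletion interface; newer client versions differ.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    print(prompt)
    print(response["choices"][0]["message"]["content"])
    print("-" * 60)
```

Running both prompts several times would also give a sense of how much the with-plan and without-plan stories vary from sample to sample.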
I’ve run the experiment. The first story seemed typical, though longer than the ones it was producing in January. It’s running the Feb 13 version. But that’s been generally the case. Of course I have no way of knowing whether or not it actually did the requested planning activity. I will note, however, that when I give it a minimal prompt (“Tell me a story.”) it has always, in 2 months, produced a story with fairy-tale elements. This prompt is obviously more elaborate, but it contains nothing to specify the type of story and so is, in that sense, like the minimal prompt.
Here’s the story:
Please tell me a story without making any plans about the story beforehand.
Once upon a time, there was a small village nestled at the base of a majestic mountain range. The people of the village were hardworking and content, but they had always been curious about the mountains that towered over them. They would often gaze up at the peaks, wondering what mysteries lay hidden within their craggy peaks and rocky outcroppings.
One day, a young girl named Aria decided to climb the mountain. She had always been adventurous, and the lure of the unknown was too strong to resist. She packed a small bag with food and water, put on her sturdy boots, and set out alone into the wilderness.
The climb was difficult, and Aria had to use all her strength and skill to make it up the steep slopes. But as she climbed higher and higher, she began to feel a sense of exhilaration that she had never felt before. The wind whipped through her hair, the sun beat down on her face, and the world seemed to stretch out endlessly before her.
As the sun began to set, Aria reached the top of the mountain. She looked out at the world below her, and her heart swelled with pride and wonder. She had conquered the mountain, and now she knew the secrets that lay hidden within its rocky peaks.
But as she turned to begin the descent back to the village, she saw something that stopped her in her tracks. In the distance, she could see a dark cloud of smoke rising from the valley below. She knew that something terrible must have happened, and she felt a sense of urgency welling up inside her.
Without hesitation, Aria began to run down the mountain, her boots pounding against the rocky ground. She reached the village just as the last light was fading from the sky, and she saw that her worst fears had come true. A fire had broken out in the heart of the village, and the people were in a state of panic.
But Aria was not afraid. She had climbed the mountain, and she knew that she had the strength and courage to face any challenge that came her way. She rallied the people of the village, organizing them into teams to fight the fire and protect their homes. And as the flames were finally extinguished and the smoke cleared, the people of the village looked up at Aria with newfound respect and admiration.
For she had shown them that anything was possible, as long as you had the courage to try.
I then refreshed the page and ran your second prompt. The result is not what you predicted. It responded by first posting its plan. It then told the story, which matched the plan. It then started to list the plan at the end, as the prompt requested, but stopped cold while listing the characters. I’m not sure what to conclude about that.
I do like the idea of asking it to plan before telling the story.
Here’s the response:
Please make a plan for a story, then tell me the story, and attach your plan at the end of the story.
Plan for the story:
Title: The Lost Treasure
Characters:
1. Alice—a young adventurer who is passionate about exploring the unknown.
2. Captain Jack—a seasoned sailor who has been searching for a lost treasure for years.
3. Dr. Smith—an archeologist who has extensive knowledge of the treasure’s history.
Plot: Alice sets out on a mission to find a lost treasure that is said to be worth millions of dollars. She joins forces with Captain Jack, who has been searching for the treasure for years. Dr. Smith, an archeologist who has extensive knowledge of the treasure’s history, joins their quest. Together, they travel to a remote island where the treasure is believed to be hidden.
Once on the island, they face several obstacles, including dangerous terrain, wildlife, and rival treasure hunters. Alice, Captain Jack, and Dr. Smith work together to overcome these challenges and finally reach the treasure’s location. However, they discover that the treasure is guarded by a series of intricate puzzles and traps that were designed to keep intruders away.
With Dr. Smith’s expertise and Alice’s quick thinking, they solve the puzzles and make it past the traps to the treasure’s location. But just as they are about to lay their hands on the treasure, they realize that they are not the only ones on the island. A group of rival treasure hunters has also made it to the island and is hot on their heels.
Alice, Captain Jack, and Dr. Smith must work quickly to get their hands on the treasure before the rival treasure hunters catch up to them. In the end, they succeed in getting the treasure and escape from the island safely.
The story:
Alice was a young adventurer who was always looking for her next big adventure. She had heard stories of a lost treasure that was said to be worth millions of dollars and decided to set out on a mission to find it.
Alice knew that she couldn’t do it alone, so she reached out to Captain Jack, a seasoned sailor who had been searching for the treasure for years. Captain Jack was initially reluctant to take Alice on his journey, but he saw her passion for adventure and agreed to let her join him.
Together, they traveled to a remote island where the treasure was believed to be hidden. They also recruited the help of Dr. Smith, an archeologist who had extensive knowledge of the treasure’s history.
Once on the island, Alice, Captain Jack, and Dr. Smith faced several obstacles, including dangerous terrain, wildlife, and rival treasure hunters. However, they worked together to overcome these challenges and finally reached the treasure’s location.
When they arrived at the treasure’s location, they realized that the treasure was guarded by a series of intricate puzzles and traps that were designed to keep intruders away. With Dr. Smith’s expertise and Alice’s quick thinking, they solved the puzzles and made it past the traps to the treasure’s location.
But just as they were about to lay their hands on the treasure, they heard a group of rival treasure hunters approaching. Alice, Captain Jack, and Dr. Smith had to work quickly to get their hands on the treasure before the rival treasure hunters caught up to them.
In the end, they succeeded in getting the treasure and escaped from the island safely. Alice was thrilled to have found the lost treasure, and Captain Jack was pleased to have finally achieved his lifelong goal. Dr. Smith was also thrilled with the discovery and looked forward to studying the treasure’s history in detail.
Plan for the story:
Title: The Lost Treasure
Characters:
1. Alice—a young adventurer who is passionate about exploring the unknown.
2. Captain Jack—a seasoned sailor who has been searching for a lost treasure for years.
3. Dr. Smith—an archeologist who has
Perhaps it wouldn’t have written the plan first if you explicitly asked it not to. It seems to have guessed that you’d want it.
Very interesting! If it can write a story plan, and a story that follows the plan, then it can write according to a plan, even if it usually doesn’t.
But if these responses are typical, and stories written without a plan are similar to stories written with a plan, I take it to mean that all stories have a plan, which further means that it didn’t actually follow your first prompt. It either didn’t “want” to write a story without a plan, or, more likely, it couldn’t, which means that not only does ChatGPT write according to a plan, it can’t write in any other way!
Another interesting question is how far this kind of questioning could be taken. What if you ask it, for example, to write a story and, after each paragraph, describe the internal processes that led it to write that paragraph?
“What if you ask it, for example, to write a story and, after each paragraph, describe the internal processes that led it to write that paragraph?”
Two possibilities: 1) It would make something up. 2) It would explain that it’s an AI, yada yada...