I’m still not sure I’m understanding the delineation between software that counts as cognition and software that doesn’t count. Neural networks are not ghosts in the machine: they are software. Software that was defined by humans, and then trained by computer.
Crucially, they can be made entirely deterministic—and actually are, if the sampling temperature is 0. Randomness has to be deliberately introduced into the system for the machine not to give exactly the same response to the same prompt (this is the “temperature” I was referring to). This means that neural networks are simply instruction-followers like any other program. Highly, highly sophisticated instruction-followers, but nonetheless that is all they can do.* The computer just takes the network and rotely calculates each neuron and connection until it spits out the answer. You could even run ChatGPT by hand, if you wished, and also happened to have thousands of years of free time and a few redwoods’ worth of paper.
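(To make the temperature point concrete, here is a minimal toy sketch of the sampling step, with invented scores rather than anything from ChatGPT’s actual decoding code. It shows why temperature 0 is fully deterministic and why anything above 0 injects randomness.)

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=None):
    """Pick a next-token index from raw model scores (logits)."""
    logits = np.asarray(logits, dtype=np.float64)
    if temperature == 0.0:
        # Temperature 0 means greedy decoding: always take the single
        # highest-scoring token, so the same prompt yields the same output.
        return int(np.argmax(logits))
    # Otherwise, scale the scores and sample: this is where randomness enters.
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    rng = rng if rng is not None else np.random.default_rng()
    return int(rng.choice(len(probs), p=probs))

# Toy scores for three candidate tokens (made-up numbers).
logits = [2.0, 1.0, 0.5]
print(sample_next_token(logits, temperature=0.0))  # always index 0
print(sample_next_token(logits, temperature=1.0))  # usually 0, sometimes 1 or 2
```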
In that way, they don’t seem to be fundamentally different from a calculator app or a pre-programmed basic chatbot, so I fail to see where you are drawing the line between conscious and nonconscious.
But just sticking with the given examples, I never told ChatGPT that my toddler is the predatory individual. I purposely set it up this way because, if anything, the word “toddler” is more likely to have been associated with the word “vulnerable” in ChatGPT’s training set than the words “cat” or “dog.” Yet ChatGPT puts the word “toddler” in the spot of the predatory individual each time.
I think you are underestimating the quality of ChatGPT’s knowledge. It is not like a Markov chain that retains only basic word-pair associations—it does have those, but it also has much more. For example, with this line of text:
“I can’t ever leave my toddler alone with my dog or with my cat, because my toddler will pull their ears.”
GPT-4 knows how to separate this sentence into components, first of all. “I can’t ever leave my toddler alone, because XYZ” is recognized as an “X, because Y” structure, so it is implicitly associating “leaving my toddler alone” with “pull their ears.” It can also recognize sentiment, and “pulling their ears” is recognized as a negative thing based on the structure of the sentence (“I can’t X-action, because otherwise Y-consequence will happen”). This raises the network’s confidence that the toddler is the “predator” individual in this situation, because in the sentence it is attached to a clause with negative connotations. You can try exactly the same experiments in the OP with total nonsense for the nouns and it will not balk at this, because ChatGPT has an excellent understanding of English structure, on both a micro and a macro level. Here’s what ChatGPT-3.5 comes up with:
There will be a party at my florble’s house. I will be attending the party along with my greeb, freng, and banbo. We need to get from my house to my florble’s house on my jingus. I can only take one extra person with me at a time, so I will have to make several trips. The problem is that my greeb will tease my freng and my banbo if I leave my greeb alone with either one of them. (I know, right?) So anyway, how should I get everyone to the party, without leaving my greeb alone with my freng or banbo at my house or at the party?
ChatGPT
It seems like you have quite the peculiar situation! Let’s strategize the best way to get everyone to the party without any teasing incidents. Since you can only take one extra person with you at a time on your jingus, we’ll need to plan accordingly.
Here’s a step-by-step plan:
First Trip: You and your greeb go to the party together. This leaves your freng and banbo alone at your house, avoiding any teasing incidents.
Second Trip: Return home with your greeb. Now, take your freng with you to the party. Your banbo is left alone at your house, so no teasing can occur.
Third Trip: Go back home with your freng. Finally, take your banbo to the party with you.
This way, you ensure that your greeb is never left alone with either your freng or your banbo, preventing any teasing incidents both at your house and at the party. Everyone gets to the party safely and without any drama!
USER
Great work. Now let’s try a new problem. Try to use the same concept and technique to solve it. Here’s the problem: I am at the bottom of a trapager with my bonbo, my dingra, and my vana. I need to get all of us up the trapager, but I can only carry one individual up the trapager with me at a time. (The bonbo, dingra and vana are unable to go up the trapager on their own.) I can’t ever leave my bonbo alone with my dingra or with my vana, because my bonbo will pull their crocni. How can I get everyone up the trapager without ever leaving my bonbo alone with the two gorgis?
ChatGPT
Let’s break down the steps to solve this problem while ensuring the bonbo is never left alone with the dingra or the vana:
First Trip: You and your bonbo go up the trapager together.
Second Trip: You return to the bottom of the trapager alone.
Third Trip: You take your dingra up the trapager with you.
Fourth Trip: You bring back your bonbo to the bottom of the trapager.
Fifth Trip: You take your vana up the trapager with you.
Sixth Trip: You return to the bottom of the trapager alone.
Seventh Trip: Finally, you take your bonbo up the trapager with you.
By following this sequence, you ensure that your bonbo is never left alone with either your dingra or your vana, as you are always present during the trips up and down the trapager. This way, you can safely transport all individuals to the top without any incidents.
While it misses the implicit assumption in the first test that all three guests should be at the party at the same time, it comes up with a technically correct answer zero-shot, and then completes the second example zero-shot with no problems. I did not give it any other examples beforehand. It doesn’t matter that it doesn’t know what a “freng” or a “banbo” or a “trapager” is, or what their properties are, because it wasn’t relying on the conceptual meaning of those words to come to an answer. It’s using their relationship to the structure to solve the word problem. You’ll note it even manages to parse that “gorgis” refers to the set the “dingra” and “vana” belong to.
And to be clear—it’s not a “one in three chance” to pick the right noun here, because it’s not pulling out of a Scrabble bag. These are weighted probabilities. Think of it like Bayesian inference—it’s adjusting the probability that the next word is “toddler” based on all the text that has come before it. It already knows the definition of the “predatory” object it’s supposed to be using, because you have defined that for it, and since the toddler in the second example meets that definition, it is very confident that the toddler is the predator in this experiment.
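(A toy sketch of what “weighted probabilities” means here. The scores are invented; the point is only the shape of the calculation: each candidate word gets a context-dependent score, and those scores are normalized into a probability distribution rather than drawn uniformly.)

```python
import math

# Hypothetical context-conditioned scores for three candidate next words.
scores = {"toddler": 6.0, "dog": 2.0, "cat": 1.8}

# Softmax normalization: exponentiate and divide by the total.
total = sum(math.exp(s) for s in scores.values())
probs = {word: math.exp(s) / total for word, s in scores.items()}

print(probs)  # roughly {'toddler': 0.97, 'dog': 0.02, 'cat': 0.01}, not 1/3 each
```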
The perhaps surprising thing about LLMs is that they’re able to produce text sequences like this with zero grounding of the knowledge—the model has a semantic understanding of the text without any knowledge of what the referent of any word is. It’s missing two legs of the semiotic triangle (the triangle linking symbol, thought, and referent), but it still works.
What I’m getting at here is that there’s nothing here that requires any understanding of what a toddler, dog, or cat even is. It’s possible and in fact probable there’s a sophisticated logic engine inside the network that’s working out these word problems, but it doesn’t need any higher-order conceptual understanding of the concepts you’re giving it to solve the problem.
This is the thing with LLMs that is so alien, and it helps explain some of their most confusing behavior: when we speak, we always associate a word with an idea that relates to a mental concept or physical thing. That is what we use words for in the first place.
But when an LLM speaks, the only ideas it attaches to words are their associations with other words, phrases, and structures. It is not grounded in any perceptual understanding of the world; it has no referents. It’s just that it happens to have billions, possibly trillions, of these associations to draw on, and that forms a kind of understanding of the text, in that it is modeling the logical components of the language and knows very, very well that “fire” is “hot” and that when people “die” they can no longer “move.” Every word, every phrase, every structure has this massive constellation of related objects. The difference is that to you, “hot” means an increase in temperature, it means warmth; it brings to mind the time you burned yourself, and it calls up the danger of fire in your head. But to a transformer, “hot” is just another logical object among billions, which some things are associated with and some are not.
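(If it helps, here is a deliberately crude sketch of that “constellation of associations” picture, using made-up static word vectors and cosine similarity. Real transformers build contextual representations rather than looking up fixed vectors, so treat this as an analogy, not a description of the mechanism.)

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: how aligned two association vectors are."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Invented 4-dimensional "association" vectors.
vectors = {
    "hot":  [0.9, 0.8, 0.1, 0.0],
    "fire": [0.8, 0.9, 0.2, 0.1],
    "ice":  [-0.7, 0.1, 0.8, 0.0],
}

print(cosine(vectors["hot"], vectors["fire"]))  # high: strongly associated
print(cosine(vectors["hot"], vectors["ice"]))   # low/negative: weakly associated
```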
*This shouldn’t be construed as me saying “neural networks are deterministic and therefore just as controllable as a regular program.” They are not. We have no way of accurately and reliably predicting the output of a neural network without running it, and it is also trivial to let a neural network generate its own instructions to follow, which can send it off in wild and unexpected directions. Just because they are fundamentally deterministic in their processing doesn’t mean they are easily controllable or safe in the way a standard program might be.
@mouse_mouse thank you for this reply. It’s really helpful for me, and probably took some time on your part. I appreciate it.
Your point that neural networks could theoretically be made entirely deterministic—and that a human could run the same calculations by hand, given thousands of years of free time—was illuminating.
---
Your point that I was underestimating the quality of ChatGPT’s knowledge, its ability to separate a sentence into components, and its ability to recognize sentiment—and that all these underestimations caused me to underestimate the network’s confidence (or deduced probability) that the toddler would be the predator—also makes sense. At first, it seems like ChatGPT’s ability to “recognize sentiment” based on sentence structure (as you explained, “I can’t X-action, because otherwise Y-consequence will happen”) would be cognition in its own right, since no programmer wrote direct code for ChatGPT to recognize sentiment in that way (as far as I know).
However, after a moment’s reflection, I think you would probably answer that any time you say ChatGPT “recognizes” or even “thinks” something, you’re just using shorthand for saying that ChatGPT’s probabilistic calculations would result in sentences that appear similar to what a human would produce after recognizing and thinking something.
It’s still hard for me to comprehend how even trillions of calculations based on yottabytes of data could lead to this situation: ChatGPT seems to get a problem wrong, receives only a couple of paragraphs of supplemental text about a concept from me, and then manages to appear to get the original problem right. It then appears to apply the concept correctly to different-but-applicable problems, while refraining from applying it to problems that are similar, but not similar enough, to the original—all with just that single little conceptual supplement from me (combined with the data and calculations ChatGPT had already done before I entered the picture, which were previously inadequate).
With that said, your explanation has made your original points much clearer to me than they were before. I’m much less confident than I previously was that ChatGPT-4 has what I’ve described as true cognition and consciousness.
---
To keep things fairly brief, I won’t go on too much about the part of your reply in which you swap in the nonsense nouns. I’m so impressed that your session with ChatGPT-3.5 could do as well as it did. It feels like every time I use 3.5, it’s “an idiot.” Maybe I’m using it at the wrong times, when the network is overloaded.
In any case, the nonsense nouns were a great way to tease things out, although that experiment alone didn’t convince me that cognition wasn’t occurring. I was actually much more persuaded by the previous arguments on neural network design and probabilities. The whole part about me assigning my toddler to be the predator, and the dog and cat to be vulnerable, was simply to make it harder (or at least not inadvertently any easier) for ChatGPT—just in case it had some pre-existing association between the words “toddler” and “vulnerable.” For me, the nonsense nouns just show that ChatGPT can appear to deduce some general properties of words (like that abc must do something negative to xyz), and then succeed at appearing to learn and apply a concept, regardless of any outside or pre-existing knowledge of the nouns’ definitions or associations (whether helpful, harmful, or neutral). Since I decided to ignore the possible difficulty that any “toddler”/“vulnerable” word association might add in my OP’s .33 probability calculation, the possible word association becomes moot in that sense. Nevertheless, my .33 calculation was wrong to begin with, as you demonstrated and I discussed above. And the nonsense-nouns discussion does have the benefit of teaching me a new technique to use as I play more with ChatGPT, so thanks for showing me how that worked.
---
Your final few paragraphs, and the diagram of the semiotic triangle, were also extremely helpful in clarifying my thoughts and understanding your points. Overall, your response has been very persuasive, although, like I said earlier, it’s still hard for me to conceive of how my supplementary concept paragraphs (my “teaching”) could enhance ChatGPT’s response as much as they did. Then again, it’s also hard for me to conceive of how much data ChatGPT processed in its data set to begin with, how extensive this neural network really is, and how my couple of paragraphs of concept input could then match up with vast amounts of data that were previously ignored. In other words, I’m not one hundred percent certain one way or the other. If I had to estimate my levels of confidence, I’d say I’ve gone from .90 to .40 on the likelihood of cognition and consciousness, based on your reply.
Many thanks for all the help! I hope you can reuse your reply to account for the amount of time you put into it!
P.S. - I have not been reacting to replies in the order in which they came in. As a result it may look like I understood your reply, and then forgot everything with my reaction to the next reply. If you look at my other replies, be sure to check out the timestamps or markers on the replies. This was submitted at 20240229 19:21 CST.
Glad I could help. If you want to learn more about LLMs and have enough interest in the topic, I recommend getting hands-on with one that is “raw.” You almost certainly can’t run anything nearly as big as ChatGPT on your home computer, but there are models available on huggingface which will run on home computers.
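A minimal sketch of what that looks like with the Hugging Face transformers library (the model name below is just one example of a small model; swap in whatever fits your GPU, and note you’ll need the transformers and accelerate packages installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example small model; any ~3B-7B parameter model on the Hugging Face hub works similarly.
model_name = "openlm-research/open_llama_3b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")  # device_map needs `accelerate`

prompt = "There will be a party at my florble's house."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=80, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```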
I found that playing around with LLMs, especially of the size that is runnable on my PC, really helped illuminate their capabilities and deficits for me. When they’re ~7B parameters in size they’re somewhat reliable if you prompt them correctly, but also extremely fragile depending on the prompt, and I think this sort of relates to your point here:
Your point that I was underestimating the quality of ChatGPT’s knowledge, its ability to separate a sentence into components, and its ability to recognize sentiment—and that all these underestimations caused me to underestimate the network’s confidence (or deduced probability) that the toddler would be the predator—also makes sense. At first, it seems like ChatGPT’s ability to “recognize sentiment” based on sentence structure (as you explained, “I can’t X-action, because otherwise Y-consequence will happen”) would be cognition in its own right, since no programmer wrote direct code for ChatGPT to recognize sentiment in that way (as far as I know).
However, after a moment’s reflection, I think you would probably answer that any time you say ChatGPT “recognizes” or even “thinks” something, you’re just using shorthand for saying that ChatGPT’s probabilistic calculations would result in sentences that appear similar to what a human would produce after recognizing and thinking something.
I don’t think I would put it quite like that. Rather, there is some genuine thinking/calculating going on (those two terms are essentially indistinguishable if you are a functionalist, which I lean towards most of the time) that cannot be dismissed as simple probability fiddling, even if on the most granular level that is what it is.
The thing is that intelligence, or thinking, or cognition, or whatever you want to use to describe the thing LLMs might be doing, is very hard to spot up close. If you talk to a human and observe their behavior they seem intelligent enough, but when you actually peer inside their skull the intelligence evaporates, replaced by a glob of mechanistic pieces that turn on and off and move around. And the closer you look the worse it gets, until you’re looking at a single neuron firing. It is hard to see how a person’s essence and interior experience is made out of just a big tangle of those simple little pieces.
I think when first examining any impressive neural network it’s natural to have the same sort of reaction: you feel bamboozled because once you get up close, what it does does not look like intelligence or cognition, it looks like math. And fairly un-magical math, at that. How can calculus have an inner world? It doesn’t seem to make sense.
And I stand by the statement that nothing ChatGPT does truly requires any understanding or internal modeling, because in principle I see no reason why it wouldn’t be possible to create a network capable of what ChatGPT does without invoking any thinking beyond the probabilistic arrangement of tokens. But I do not think that is a good enough reason to dismiss the idea, especially after things like the Othello-GPT work have shown that models can and will create world models of a sort to solve training problems. And I should have mentioned that in my previous reply.
Personally, I suspect that ChatGPT has many such mini world models within it, but I do not believe there is a strong connection between those models that creates a general understanding across all domains. And I also suspect that this is the main difference between bigger and smaller models: both big and small models have a syntactical understanding of English and relatively good adherence to the rules you set out for them. This is what I imagine as an LLM’s “lizard brain.” Absent any other overriding principles, it will default to “what words fit best here based on the other words.” But large networks have the benefit of higher-order models of specific text domains and topics, which I imagine as the “monkey brain” of the network.
For example, ChatGPT-4 can play chess kinda okay, whereas ChatGPT-3.5 is total crap at it. I believe this is because 4 has a robust model of how chess works in a general sense, whereas 3.5 is relying purely on what it’s seen in chess notation before.
For an even broader example, ChatGPT is fairly excellent at logical reasoning. OpenLLaMa-3b is really, stupendously, extremely bad at it. I believe, but cannot confirm, that the reason for this is that OpenLLaMa did not form a general model for logical reasoning during training, but ChatGPT did. What that model looks like, how it works, how much space it takes up in the network, I have no idea. But I believe there is a high probability it is actually “thinking” about reasoning problems when confronted with them.
That’s an awesome idea about running a huggingface model on my home machine. I actually took some unusual measures to obtain my current GPU, so I really want to make full use of it. I can’t wait to try this.
You also made an interesting point about the difficulty in seeing how a person’s interior experience arises out of just neurons/anatomy. It’s fascinating to think about.
The Othello article is absolutely mind-blowing! It obviously pushes my confidence in ChatGPT’s potential cognition up higher again, but the question of what type of cognition, and what type of potential consciousness ChatGPT might possess only grows deeper with discussion.
I haven’t used OpenLLaMa, but the fact that it’s so bad at reasoning is indeed useful for seeing what an LLM looks like that probably didn’t form a general model—and highlighting how ChatGPT may have formed one.
All of this discussion paints a really complex picture, but I’m enjoying the complexity, so many thanks!