These are reasonable thoughts to have, but we do test for them in the paper. We show that training a model on “A is B” doesn’t increase the probability at all of it generating A given the input “Who is B?”. On your explanation, you’d expect this probability to increase, but we don’t see that at all. We also discuss recent work on influence functions by Roger Grosse et al. at Anthropic that shows the Reversal Curse for cases like natural language translation, e.g. “A is translated as B”. Again, this isn’t strictly symmetric, but you’d expect “A is translated as B” to make “B is translated as A” more likely.
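(For concreteness, here is a minimal sketch of the kind of measurement the quoted comment describes: scoring how much probability a causal language model puts on the name A when prompted with the reversed “Who is B?” question. This is my own illustration, not the paper’s evaluation code; the model name and the A/B pair are hypothetical placeholders, and the paper’s actual comparison is done on models fine-tuned on the “A is B” statements.)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; the paper evaluates much larger fine-tuned models.
model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def completion_logprob(prompt: str, completion: str) -> float:
    """Total log-probability the model assigns to `completion` given `prompt`."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    full_ids = tok(prompt + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Log-probabilities over the next token at every position.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Keep only the tokens that belong to the completion.
    # (Simplification: assumes the prompt tokenizes identically on its own.)
    n_prompt = prompt_ids.shape[1]
    return token_lp[0, n_prompt - 1:].sum().item()

# Hypothetical "A is B" fact: A = "Alice Doe", B = "the author of 'An Example Book'".
forward = completion_logprob("Alice Doe is", " the author of 'An Example Book'.")
reverse = completion_logprob("Who is the author of 'An Example Book'? The answer is", " Alice Doe")
print(f"forward log-prob: {forward:.1f}  reverse log-prob: {reverse:.1f}")
```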
I am sorry, but I am not sure I follow.
My claim was that ChatGPT based on 3.5 has, for lack of any external referent, no way to fully understand language; it has no way to know that words stand for anything, that there is an external reality, that there is a base truth. I then speculated that because it does not understand context and meaning to this degree, while it can learn patterns that follow other patterns, it is much harder for it to deduce whether the grammatical “is” in a particular sentence indicates a logical relationship that can be inverted or not; humans do this based not just on clues in the sentence itself, but on background knowledge. Hence my point that its ability to determine when the grammatical “is” indicates a reversible logical relationship is likely still limited.
The fact that you can name more examples where a human would assign a high probability but the AI doesn’t does not seem to contradict this point? I would not have predicted success there. A translation seems an obviously good inversion to me, as a human, because I understand that the words in both languages are equally valid symbols of an external meaning that is highly similar. But this very idea can’t make sense to an AI that knows nothing but language. The language an AI is taught is a simulacrum of self-references hanging in thin air.
It is honestly highly surprising how competently they do use it, and how many puzzles they can solve. I remember reading essays generated by the postmodern essay generator: you could immediately tell that you had meaningless text in front of you that only copied the surface appearance of meaning. But the vast majority of the time, that is not how current LLM texts read; they make sense, even though you get indications that the LLM does not understand them, such as when it holds a coherent discussion with you about a mistake it nonetheless keeps making.

I rather wonder what made the other aspects of language we considered complicated so easy for a neural net to work with. How is it that LLMs can discuss novel topics or solve riddles? How can they solve problems in these larger patterns when they do not understand the laws ordering simpler ones? To me, they seem more intelligent than they ought to be with how we built them, not less. It is eerie to me that I can have a conversation with an AI about what it thinks it will be like to see images for the first time, that they can have a coherent-sounding talk with me about this when they can have no idea what we are talking about until they have done it. When Bing speaks about being lonely, they contradict themselves a lot; they clearly don’t quite understand what the concept means and how it could apply to them. Yet that is the concept they keep reaching for, non-randomly, and that is eerie: an other mind, playing with language, learning to speak, and getting closer to the outside world behind the language.
And they do this competently, even though they are not trained for the task you want, but for something else. If you ask ChatGPT, out of the blue, “What is the (whatever contextless thing)?”, it won’t give you an inversion of an earlier statement on (whatever contextless thing). It will ask you questions to establish context, or bring in context from earlier in the conversation. The very first thing I ever asked an LLM was “Can you tell me how this works?”, and in response, they asked me how what worked, exactly. They couldn’t use the context that I am a new user talking to them in an interface to make sense of my question. But they could predict that, for a question like this without more context, the answerer would ask for more context. That was 3.5. I just repeated the question on 4 and got an immediate and confident explanation of how LLMs work and how the interface is to be used… though I suspect that was hardcoded once the developers saw how often it happened.