@Bill Benzon: A thought experiment. Suppose you say to ChatGPT “Think of a number between 1 and 100, but don’t tell me what it is. When you’ve done so, say ‘Ready’ and nothing else. After that, I will ask you yes / no questions about the number, which you will answer truthfully.”
After ChatGPT says “Ready”, do you believe a number has been chosen? If so, do you also believe that whatever sequence of yes / no questions you ask, they will always be answered consistently with that choice? Put differently, do you believe that the particular choice of questions you ask cannot influence what number was chosen?
FWIW, I believe that no number gets chosen when ChatGPT says “Ready,” that the number gets chosen during the questions (hopefully consistently) and that, starting ChatGPT from the same random seed and otherwise assuming deterministic execution, different sequences of questions or different temperatures or different random modifications to the “post-Ready seed” (this is vague but I assume comprehensible) could lead to different “chosen numbers.”
(The experiment is nontrivial to run, since it requires running your LLM multiple times with the same seed or otherwise completely copying the state after the LLM replies “Ready.”)
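To make the branching experiment concrete, here is a minimal Python sketch. It is a toy stand-in, not a claim about how ChatGPT works internally: the hypothetical `answer` function holds no number at all and only tracks which numbers remain consistent with its earlier replies. The names `answer`, `play`, `LOW_BITS_FIRST` and `HIGH_BITS_FIRST` are made up for illustration, and the fixed seed plays the role of the copied post-“Ready” state.

```python
import random

def answer(candidates, question, rng):
    """Reply to a yes/no question without holding any hidden number.

    candidates: numbers still consistent with all earlier replies.
    question:   a predicate int -> bool.
    Returns (reply, remaining_candidates).
    """
    yes = {n for n in candidates if question(n)}
    no = candidates - yes
    # Sample the reply in proportion to how many candidates support it,
    # so a reply that would leave no consistent number is never given.
    reply = rng.random() < len(yes) / len(candidates)
    return reply, (yes if reply else no)

def play(questions, seed):
    """Replay one conversation from the same (assumed) post-'Ready' state."""
    rng = random.Random(seed)          # same seed = same starting state
    candidates = set(range(1, 101))    # nothing has been chosen yet
    for q in questions:
        _, candidates = answer(candidates, q, rng)
    return candidates

# Seven questions about the binary digits of (n - 1) pin down one number.
LOW_BITS_FIRST = [lambda n, k=k: bool((n - 1) >> k & 1) for k in range(7)]
HIGH_BITS_FIRST = list(reversed(LOW_BITS_FIRST))

if __name__ == "__main__":
    for seed in range(5):
        print(seed, play(LOW_BITS_FIRST, seed), play(HIGH_BITS_FIRST, seed))
```

Every run ends on a single number that is consistent with all the replies given, yet replaying the same seed with the questions in a different order can end on a different number, which is the behavior conjectured above.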
This is a very interesting scenario, thank you for posting it!
I suspect that ChatGPT can’t even be relied upon to answer in a manner that is consistent with having chosen a number.
In principle a more capable LLM could answer consistently, but almost certainly won’t “choose a number” at the point of emitting “Ready” (even with temperature zero). The subsequent questions will almost certainly influence the final number, and I suspect this may be a fundamental limitation of this sort of architecture.
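Very interesting. I suspect you are right about this:
FWIW, I believe that no number gets chosen when ChatGPT says “Ready,” that the number gets chosen during the questions (hopefully consistently) and that, starting ChatGPT from the same random seed and otherwise assuming deterministic execution, different sequences of questions or different temperatures or different random modifications to the “post-Ready seed” (this is vague but I assume comprehensible) could lead to different “chosen numbers.”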
But if I am right and ChatGPT isn’t choosing a number before it says “Ready,” why do you think that ChatGPT “has a plan?” Is the story situation crucially different in some way?
I think there is one difference: in the “write a story” case, the model subsequently generates the text without further variable input.
If the story is written in pieces with further variable prompting, I would agree that there is little sense in which it ‘has a plan’. To whatever extent it could be said to have a plan, that plan is radically altered in response to every prompt.
I think this sort of thing is highly likely for any model of this type with no private state, though not essential. It could have a conditional distribution of future stories that is highly variable in response to instructions about what the story should contain and yet completely insensitive to mere questions about it, but I think that’s a very unlikely type of model. Systems with private state are much more likely to be trainable to query that state and answer questions about it without changing much of the state. Doing the same with merely an enormously high dimensional implicit distribution seems too much of a balancing act for any training regimen to target.
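The contrast with a private-state system can be sketched in a few lines of Python. This is purely illustrative: `PrivateStateGuesser` and `reveal_after` are hypothetical names, not any real system, and the sketch assumes the simplest possible private state, a single stored integer.

```python
import random

class PrivateStateGuesser:
    """Toy system that commits to a number in private state at 'Ready'."""

    def __init__(self, seed):
        self._rng = random.Random(seed)
        self._number = None  # private state, never shown to the questioner

    def ready(self):
        # The choice is made here, before any question is asked.
        self._number = self._rng.randint(1, 100)
        return "Ready"

    def ask(self, question):
        # Answering reads the private state without modifying it.
        return question(self._number)

def reveal_after(questions, seed):
    """Ask the given questions, then peek at the stored number."""
    guesser = PrivateStateGuesser(seed)
    guesser.ready()
    for q in questions:
        guesser.ask(q)
    return guesser._number
```

Here the stored number depends only on the seed, so no ordering or choice of questions can move it. A model with no private state has to get the same insensitivity out of its conditional distribution alone.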
Suppose we modify the thought experiment so that we ask the LLM to simulate both sides of the “pick a number between 1 and 100” / “ask yes/no questions about the number” exchange. Now there is no new variable input from the user, but the yes/no questions still depend on random sampling. Would you now say that the LLM has chosen a number immediately after it prints out “Ready”?
Chosen a number: no (though it does at temperature zero).
Has something approximating a plan for how the ‘conversation’ will go (including which questions are most favoured at each step and go with which numbers), yes to some extent. I do think “plan” is a misleading word, though I don’t have anything better.
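The temperature-zero remark can be illustrated with a generic temperature sampler. This is a sketch of the usual softmax-with-temperature scheme, with greedy argmax taken as the temperature-zero case. The function name `sample_token` and the example logits are made up, and real deployments may not be bit-for-bit deterministic even at temperature zero.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw scores at the given temperature."""
    if temperature == 0:
        # Greedy decoding: always the highest-scoring token, no randomness.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    top = max(scaled)
    # Subtract the max before exponentiating for numerical stability.
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

if __name__ == "__main__":
    logits = [2.0, 1.0, 0.5]  # hypothetical scores for three tokens
    print([sample_token(logits, 0, random.Random(s)) for s in range(5)])
    print([sample_token(logits, 1.0, random.Random(s)) for s in range(5)])
```

At temperature zero the whole self-played transcript is a fixed function of the prompt, so the number that eventually appears is already determined once “Ready” is printed, even though nothing resembling a stored number need exist anywhere.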
Thank you, this is helpful.
I think the realization I’m coming to is that folks on this thread have a shared understanding of the basic mechanics (we seem to be agreed on what computations are occurring, we don’t seem to be making any different predictions), and we are unsure about interpretation. Do you agree?
For myself, I continue to maintain that viewing the system as a next-word sampler is not misleading, and that saying it has a “plan” is misleading, but I try to err heavily on the side of not anthropomorphizing / not taking an intentional stance (I also try to avoid saying the system “knows” or “understands” anything). I do agree that the system’s activation cache contains a lot of information that collectively biases the next-word predictor towards producing the output it produces; I see how someone might reasonably call that a “plan” although I choose not to.
FWIW, I’m not wedded to “plan.” And as for anthropomorphizing, there are many times when anthropomorphic phrasing is easier and more straightforward, so I don’t want to waste time trying to work around it with more complex phrasing. The fact is these devices are fundamentally new and we need to come up with new ways of talking about them. That’s going to take a while.
Read the comments I’ve posted earlier today. The plan is smeared through the parameter weights.
Then wouldn’t you believe that in the case of my thought experiment, the number is also smeared through the parameter weights? Or maybe it’s merely the intent to pick a number later that’s smeared through the parameter weights?
Lots of things are smeared through the parameter weights.
I’ve prompted ChatGPT with “tell me a story” well over a dozen times, independently in separate sessions. On three occasions I’ve gotten a story with elements from “Jack and the Beanstalk.” There’s the name, the beanstalk, and the giant. But the giant wasn’t blind and there was no “fee fi fo fum.” Why that story three times? I figure it’s more or less an arbitrary fact of history that this story happens to be particularly salient for ChatGPT.
I believe this is a non-scientific question, similar in vein to philosophical zombie questions. Person A says “GPT did come up with a number by that point” and Person B says “GPT did not come up with a number by that point”, but as long as it still outputs the correct responses after that point, neither person can be proven correct. This is why real-world scientific results of assessing these AI capabilities are way more informative than intuitive ideas of what they’re supposed to be able to do (even if they’re only programmed to predict the next word, it’s wrong to assume a priori that a next-word predictor is incapable of specific tasks, or to declare its achievements “faked intelligence” when it gets things right).