A useful thread on the bag-of-heuristics view versus actual noisy algorithmic reasoning has some interesting results. They show that, at least with chain-of-thought (CoT) added, LLMs aren’t just a bag of heuristics and do actual reasoning.
Of course, there is still a pretty significant amount of bag-of-heuristics reasoning, but I do think the literal claim that a bag of heuristics is all there is in LLMs is false.
You’ve claimed that it would be useful to think about the search/planning process as being implemented through heuristics, and I think it’s sometimes true that some parts of search/planning are implemented through heuristics, but I don’t think that’s all there is to LLM planning/searching, either now or in the future.
The thread is below:
https://x.com/aksh_555/status/1843326181950828753
The paper “Auto-Regressive Next-Token Predictors are Universal Learners” made me a little more skeptical of attributing general reasoning ability to LLMs. The authors show that even linear predictive models, basically just linear regression, can technically perform any algorithm when used autoregressively, as with chain-of-thought. The results aren’t that mind-blowing, but they made me wonder whether performing certain algorithms correctly with a scratchpad is as much evidence of intelligence as I thought.
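To make the paper’s point concrete, here is a toy sketch of my own (not the paper’s actual construction): a purely linear next-token predictor over one-hot features that, applied autoregressively to a scratchpad, computes the parity of a bit string. Each step is a single matrix multiply; the multi-step algorithm comes entirely from the autoregressive chaining.

```python
import numpy as np

# Toy sketch (my own, not the paper's construction): a purely linear
# next-token predictor that, run autoregressively over a scratchpad,
# computes the parity of a bit string. Each step is one matrix multiply
# over a one-hot feature vector; the "algorithm" emerges from chaining.

# Features for one step: one-hot of the pair (current running parity, next input bit).
# The weights act as a lookup table for XOR; a lookup over one-hot features is linear.
W = np.array([
    # pair: (0,0) (0,1) (1,0) (1,1)
    [1.0,   0.0,  0.0,  1.0],   # logit for emitting token "0"
    [0.0,   1.0,  1.0,  0.0],   # logit for emitting token "1"
])

def one_hot_pair(parity: int, bit: int) -> np.ndarray:
    x = np.zeros(4)
    x[2 * parity + bit] = 1.0
    return x

def parity_via_scratchpad(bits: list[int]) -> int:
    """Autoregressively emit the running parity after each input bit."""
    scratchpad = [0]                                   # parity starts even
    for b in bits:
        logits = W @ one_hot_pair(scratchpad[-1], b)   # a single linear map
        scratchpad.append(int(np.argmax(logits)))      # greedy "next token"
    return scratchpad[-1]

assert parity_via_scratchpad([1, 0, 1, 1]) == 1
assert parity_via_scratchpad([1, 1, 0, 0]) == 0
```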
One man’s modus ponens is another man’s modus tollens: what I take away from the result is that, given enough compute, intelligence is easy to get, so easy that even linear predictive models can do it in theory.
So they don’t show that intelligent/algorithmic reasoning isn’t happening in LLMs; rather, they show that it’s easy to get intelligence/computation by many different methods.
It’s similar to the proof that an origami computer can compute every function a Turing machine can. If, in a hypothetical world, we were instead using very large origami pieces to build up AIs like AlphaGo, I don’t think there would be a sense in which they were obviously not reasoning about the game of Go.
https://www.quantamagazine.org/how-to-build-an-origami-computer-20240130/
I agree that origami AIs would still be intelligent if implementing the same computations. I was trying to point at LLMs potentially being ‘sphexish’: having behaviors made of baked if-then patterns linked together that superficially resemble ones designed on-the-fly for a purpose. I think this is related to what the “heuristic hypothesis” is getting at.
I think the heuristic hypothesis is partially right, but “partially” is the key word: LLMs will have both sphexish heuristics and mostly clean algorithms for solving problems.
I also expect OpenAI to broadly move LLMs from heuristic-like reasoning toward algorithm-like reasoning, and o1 is slight evidence of more systematic reasoning in LLMs.
Thanks for the pointer! I skimmed the paper. Unless I’m making a major mistake in interpreting the results, the evidence they provide for “this model reasons” is essentially “the models are better at decoding words encrypted with rot-5 than they are at rot-10.” I don’t think this empirical fact provides much evidence one way or another.
To summarize, the authors decompose a model’s ability to decode shift ciphers (e.g., rot-13 text “fgnl”, original text “stay”) into three factors: probability, memorization, and noisy reasoning.
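For reference, here is a minimal sketch of the task itself: decoding rot-n just means shifting each letter back by n positions, modulo 26.

```python
def rot_decode(ciphertext: str, n: int) -> str:
    """Decode a rot-n shift cipher by shifting each letter back n positions (mod 26)."""
    out = []
    for ch in ciphertext:
        if ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            out.append(chr((ord(ch) - base - n) % 26 + base))
        else:
            out.append(ch)  # leave spaces/punctuation untouched
    return "".join(out)

assert rot_decode("fgnl", 13) == "stay"   # the rot-13 example above
assert rot_decode("tubz", 1) == "stay"    # rot-1: a simpler shift, but much rarer in training data
```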
Probability refers to the roughly unconditional probability that the model assigns to the output word (specifically, to the completion ‘The word is “WORD”’). The model is more likely to correctly decode words that are more likely a priori, which makes sense.
Memorization is defined by how often a given rotational cipher shows up: rot-13 is by far the most common, followed by rot-3. The model is better at decoding rot-13 than any other cipher, which makes sense, since there is more of it in the training data and the model probably has specialized circuitry for rot-13.
What they call “noisy reasoning” depends on how many rotations are needed to get to the answer. According to the authors, the fact that GPT-4 does better on shift ciphers with fewer shifts than on ciphers with more shifts is evidence of this “noisy reasoning.”
I don’t see how you can jump from this empirical result to claims about the model’s ability to reason. For example, an alternative explanation is that the model has learned some set of heuristics that allows it to shift letters from one position to another, but that these heuristics can only be composed in a limited manner.
Generally, though, I think what counts as a “heuristic” is a somewhat fuzzy concept. What counts as “reasoning” seems even less well defined.
True that it isn’t much evidence for reasoning directly, as it’s only 1 task.
As for how we can jump from the empirical result to claims about its ability to reason: the shift-cipher task lets us disentangle commonness from simplicity. A bag of heuristics with no uniform, compact description should work best on the most common example types, whereas the algorithmic reasoning I define below should work best on the simplest ones. The simplest shift cipher is the 1-shift cipher, while rot-13 is by far the most common, so a view on which LLMs learn only (or primarily) shallow heuristics predicts that performance should peak at rot-13. The paper does show a spike in accuracy at rot-13, consistent with LLMs having some heuristics, but accuracy on the 1-shift cipher was much better than expected under the view that LLMs are solely or primarily a bag of heuristics that CoT can’t improve on.
I’m defining reasoning more formally in the quote below:
So an “algorithm” is a finite description of a fast parallel circuit for every size.
This comment is where I got the quote from:
https://www.lesswrong.com/posts/gcpNuEZnxAPayaKBY/othellogpt-learned-a-bag-of-heuristics-1#Bg5s8ujitFvfXuop8
This thread explains why we can disentangle noisy reasoning from heuristics, as I’m defining the terms here, so check that out below:
https://x.com/RTomMcCoy/status/1843325666231755174
I see, I think that second tweet thread actually made a lot more sense, thanks for sharing!
McCoy’s definitions of heuristics and reasoning are sensible, although I personally would still avoid “reasoning” as a word, since people probably have very different interpretations of what it means. I like the ideas of “memorizing solutions” and “generalizing solutions.”
I think where McCoy and I depart is that he’s modeling the entire network computation as a heuristic, while I’m modeling the network as compositions of bags of heuristics, which in aggregate would display behaviors he would call “reasoning.”
The explanation I gave above (heuristics that shift a letter forward by one position, with a limited ability to compose) is still a heuristics-based explanation. Maybe this set of composed heuristics would fit your definition of an “algorithm.” I don’t think there’s anything inherently wrong with that.
However, the heuristics-based explanation makes concrete predictions about what we can look for in the actual network: individual heuristics that increment a to b, b to c, etc., and other parts of the network that compose their outputs.
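To make that concrete, here is a hypothetical sketch of the kind of mechanism I mean; the one-step heuristic and the composition-depth cap are my own illustrative assumptions, not anything measured in a real network.

```python
def shift_back_by_one(ch: str) -> str:
    """The individual heuristic: b -> a, c -> b, ..., a -> z."""
    return chr((ord(ch) - ord('a') - 1) % 26 + ord('a'))

def decode_by_composition(ciphertext: str, shift: int, max_depth: int = 10) -> str | None:
    """Decode a rot-`shift` cipher by chaining the one-step heuristic `shift` times.

    The max_depth cap stands in for "heuristics can only be composed in a limited
    manner": past the cap the composition fails, so accuracy degrades as the number
    of required shifts grows, even though no single step looks like reasoning.
    """
    if shift > max_depth:
        return None  # composition budget exhausted
    decoded = []
    for ch in ciphertext:
        for _ in range(shift):        # compose the same heuristic `shift` times
            ch = shift_back_by_one(ch)
        decoded.append(ch)
    return "".join(decoded)

print(decode_by_composition("tubz", 1))    # 'stay': one composition step per letter
print(decode_by_composition("fgnl", 13))   # None under this cap; rot-13 would instead be handled by a memorized circuit
```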
This is what I meant when I said that this could be a useful framework for interpretability :)
Now I understand.
Though I’d still claim that this is evidence for the view that there is a generalizing solution implemented inside LLMs, and I wanted people to keep that in mind, since people often treat heuristics as meaning that they don’t generalize at all.
Yeah and I think that’s a big issue! I feel like what’s happening is that once you chain a huge number of heuristics together you can get behaviors that look a lot like complex reasoning.