“Why would LLMs ever learn to distill inner-monologues into forward passes? Why speculate about emergent communication or models in slow feedback loops through corpuses? Do you have any proof for any of this speculation about short outputs being incentivized to do as much computation covertly ‘outside’ the human-readable text of the inner-monologue?”
“Because we will train them to.”
“Look, you can’t just handwave incentives or convergent instrumental drives—wait, what?”
“Because we’ll train them to.”
“Implicit Chain of Thought Reasoning via Knowledge Distillation”, Deng et al 2023:

To augment language models with the ability to reason, researchers usually prompt or finetune them to produce chain of thought reasoning steps before producing the final answer. However, although people use natural language to reason effectively, it may be that LMs could reason more effectively with some intermediate computation that is not in natural language. In this work, we explore an alternative reasoning approach: instead of explicitly producing the chain of thought reasoning steps, we use the language model’s internal hidden states to perform implicit reasoning. The implicit reasoning steps are distilled from a teacher model trained on explicit chain-of-thought reasoning, and instead of doing reasoning “horizontally” by producing intermediate words one-by-one, we distill it such that the reasoning happens “vertically” among the hidden states in different layers. We conduct experiments on a multi-digit multiplication task and a grade school math problem dataset and find that this approach enables solving tasks previously not solvable without explicit chain-of-thought, at a speed comparable to no chain-of-thought.

...Our experiments show the potential of implicit chain-of-thought reasoning. On a synthetic multi-digit multiplication task, we found that while standard training cannot yield the final answer without explicit reasoning (even GPT-4 struggles with five-digit by five-digit multiplication), our method, applied to a GPT-2 Medium model, is able to provide direct answers for up to five-digit by five-digit multiplications. Moreover, when dealing with real-world tasks like grade school math problems, our method achieves a 22% accuracy on GSM8k (Cobbe et al., 2021) without the need for explicitly generating the intermediate steps.
Goddammit people.
Thanks for the heads up. The problem is that this is going to make them more computationally efficient at runtime, so it will be tempting for everyone to do this.
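To make concrete what “vertical” distillation could look like mechanically, here is a minimal sketch: a frozen teacher’s hidden states at each explicit reasoning step are distilled into the hidden states of successive student layers, while the student is trained to emit the final answer directly, with no intermediate tokens. The module names, the one-step-per-layer alignment, and the loss weighting below are illustrative assumptions for this sketch, not the paper’s exact training recipe.

```python
# Sketch of "vertical" distillation: teacher chain-of-thought step states are
# matched against successive student layers, while the student also learns to
# output the answer directly. Illustrative only; not Deng et al.'s exact setup.
import torch
import torch.nn as nn

class TinyStudent(nn.Module):
    """A toy transformer stub that exposes one hidden state per layer."""
    def __init__(self, vocab_size=128, d_model=64, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
             for _ in range(n_layers)]
        )
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids):
        h = self.embed(input_ids)
        per_layer = []                   # hidden state after each layer
        for layer in self.layers:
            h = layer(h)
            per_layer.append(h[:, -1])   # last input position only
        logits = self.lm_head(h[:, -1])  # predict the answer token directly
        return logits, per_layer

def implicit_cot_loss(student, input_ids, answer_ids, teacher_step_states, alpha=1.0):
    """Answer loss plus distillation of teacher CoT-step states into student layers.

    teacher_step_states: list of [batch, d_model] tensors, one per reasoning step
    of a frozen teacher trained with explicit chain of thought (assumed here to be
    already projected to the student's hidden width).
    """
    logits, per_layer = student(input_ids)
    answer_loss = nn.functional.cross_entropy(logits, answer_ids)
    # Align reasoning step i with student layer i ("vertical" reasoning):
    distill_loss = sum(
        nn.functional.mse_loss(h, t.detach())
        for h, t in zip(per_layer, teacher_step_states)
    ) / len(teacher_step_states)
    return answer_loss + alpha * distill_loss

# Toy usage with random data, just to show the shapes involved.
student = TinyStudent()
input_ids = torch.randint(0, 128, (2, 10))           # e.g. tokens of "12 * 34 ="
answer_ids = torch.randint(0, 128, (2,))             # the final answer token
teacher_step_states = [torch.randn(2, 64) for _ in range(4)]
loss = implicit_cot_loss(student, input_ids, answer_ids, teacher_step_states)
loss.backward()
```

The point of the sketch is that nothing reasoning-like ever appears in the emitted tokens: the only trace of the intermediate computation lives in the activations, which is exactly the concern voiced above.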