Are LLMs on the Path to AGI?
I am unsure, but I disagree with one argument that they aren’t.
There’s a joke about how humans have gotten so good at thinking that they tricked rocks into thinking for them. But it’s a joke, in part because it’s funny to say that computers work by “tricking rocks into thinking,” and in part because what computers do isn’t “really” thinking.
But it is possible to take the limitations of computers and computation too far. A point I’ve repeatedly seen is that “Artificial General Intelligence lies beyond Deep Learning,” which gets something fundamental, but very subtle, wrong about Large Language Models. The overall claim is that machine learning is fundamentally incapable of certain types of reasoning required for AGI. Whether that is true remains unclear, and I think the proponents of this view are substantively wrong to repeat the common claim that deep learning cannot do counterfactual reasoning.
First, though, I want to provide a bit of background to be clear about what computers are and are not doing. There is a deep question about whether LLMs understand anything, but I will claim that it’s irrelevant, because they don’t need to. Silicon and electrical waves inside of a calculator certainly do not “understand” numbers. It might be objected that since the circuits and logic gates aren’t doing math, what calculators do isn’t truly math either. But when we put them together correctly, they do addition anyway, without the logic gates and circuits understanding what they are doing. The calculator can’t “truly” do math, and yet, e pur si muove! Calculators do not “truly understand” numbers, but that doesn’t mean we cannot build something on top of electronic circuits that does addition.
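To make that concrete, here is a minimal sketch (in Python rather than silicon, so purely illustrative) of an adder built from nothing but logic gates. No gate in it knows what a number is, yet the assembly adds correctly.

```python
# None of these gates "understands" numbers, but wired together they add.

def xor(a: int, b: int) -> int:
    return a ^ b

def and_(a: int, b: int) -> int:
    return a & b

def or_(a: int, b: int) -> int:
    return a | b

def full_adder(a: int, b: int, carry_in: int):
    """Add two bits plus a carry bit, using only logic gates."""
    partial = xor(a, b)
    total = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return total, carry_out

def add(x: int, y: int, bits: int = 8) -> int:
    """Ripple-carry addition: chain full adders across the bits of x and y."""
    result, carry = 0, 0
    for i in range(bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result

print(add(19, 23))  # 42 -- correct addition, with no "understanding" anywhere in the gates
```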
To analogize briefly, cells in the human brain also don’t know how to think; they just send electrical and chemical signals, driven by chemical gradients inside and outside the cell. Clearly, the thinking happens at a different level than the sodium-potassium pumps or the individual neurons firing. That doesn’t mean human brains cannot represent numbers or do math, just that it happens at a different level than the neurons firing. But these philosophical questions aren’t actually settling anything, so I’ll abandon the analogies and get to the claimed limitations of deep learning.
Machine learning models derive statistical rules based only on observational data, and for this reason the models cannot “learn” causal relationships. So the idea that deep learning systems focus on prediction, not (causal) understanding, is at best narrowly correct1. However, to keep it simple, it is true that the representation of the data in the model isn’t a causal one: language models are not designed to have a causal understanding of the relationship between the input text and the completions, and the purely textual relationships that are learned are correlational.
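For readers who want that distinction made concrete, here is a toy sketch of what “correlational, not causal” means. The variables and numbers are mine, chosen only for illustration: a confounder makes the observational quantity P(Y | X=1) differ from the interventional quantity P(Y | do(X=1)), and a model fit purely to the observational data recovers only the former.

```python
# A toy structural causal model (illustrative numbers, not from the post):
# a confounder Z drives both X and Y, and X also has a direct effect on Y.
# Conditioning on X=1 and intervening with do(X=1) give different answers
# for Y, which is the gap between correlation and causation.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def simulate(intervene_x=None):
    z = rng.binomial(1, 0.5, n)                            # confounder
    if intervene_x is None:
        x = rng.binomial(1, np.where(z == 1, 0.9, 0.1))    # X depends on Z (observational)
    else:
        x = np.full(n, intervene_x)                        # do(X = x): cut the Z -> X arrow
    y = rng.binomial(1, 0.2 + 0.3 * x + 0.4 * z)           # Y depends on both X and Z
    return x, y

x, y = simulate()
print("P(Y=1 | X=1)     ~", y[x == 1].mean())   # observational: ~0.86, inflated by Z
x, y = simulate(intervene_x=1)
print("P(Y=1 | do(X=1)) ~", y.mean())           # interventional: ~0.70
```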
But the things which a model represents or understands are different from the things it outputs. A toy example might clarify this: if I fit a linear regression of basketball points scored on height, the model does not understand what height or basketball are, but it still outputs predictions about their relationship. That is, there is a difference between what the linear model represents (much less what it understands) and what it can do. Similarly, the things that language models can output are different from what they actually do internally.
So, to return to the claim that deep learning systems won’t properly extend to what-if scenario evaluation, rather than prediction (or the broader claim, made elsewhere, that they can’t do causal reasoning): there are several places where I think this is misleading.
First, there is an idea that because models only represent the data they are given, they cannot extrapolate. The example given is that a self-driving car, “encountering a new situation for which it lacked training,” would inevitably fail. This is obviously wrong; even our linear model extrapolates to new cases. The data may only contain heights between 4′9″ and 5′5″, and between 5′7″ and 6′2″, but the model can still provide a perfectly reasonable prediction interval for someone who is 5′6″, or even for people who are 6′4″, despite never having seen that data. Of course, that example is simplistic, but it’s very easy to see that LLMs are in fact generalizing. The poetry they write is sometimes remixed, but it’s certainly novel. The answers they give and the code they generate are sometimes simple reformulations of things they have seen, but they aren’t identical.
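Here is a minimal sketch of that linear-model point, with made-up data: the training set has a gap between 5′5″ and 5′7″ and nothing above 6′2″, yet the fitted line still produces sensible predictions at 5′6″ and 6′4″.

```python
# Fit a plain least-squares line on heights that skip 65-67 inches and stop at 74,
# then predict at heights the model has never seen. Data is entirely made up.
import numpy as np

rng = np.random.default_rng(1)

# Heights in inches: one cluster from 4'9" to 5'5", another from 5'7" to 6'2".
heights = np.concatenate([rng.uniform(57, 65, 50), rng.uniform(67, 74, 50)])
points = 2.0 * heights - 90 + rng.normal(0, 5, heights.size)   # fake "points scored"

slope, intercept = np.polyfit(heights, points, 1)              # ordinary least squares

for h, label in [(66, "5'6\" (in the gap)"), (76, "6'4\" (beyond the data)")]:
    print(f"{label}: predicted points ~ {slope * h + intercept:.1f}")
```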
Second, the claim that causality cannot be learned from observation is both correct and incorrect. It is correct that a language model cannot properly infer causality from its data without counterfactuals, but it does not need to properly represent causality internally in order to output causally correct claims and reasoning. Just as the earlier linear regression does not need to understand basketball, the LLM does not need to internally represent a correct understanding of causality. That is, it can learn to reason about causal phenomena in the real world by building purely correlational models of when to produce outputs which reason causally. And we see that this is the case!
The counterfactual reasoning here does not itself imply that there is anything inside of GPT-4 which does causal reasoning; it provides essentially no evidence either way. It simply shows that the system has learned when to talk about causal relationships from the statistical patterns in the data. Stochastic parrots can reason causally, even if they don’t understand what they are saying.
Third, this has nothing to do with LLM consciousness. There is a philosophical case which has been made2 that language models cannot truly understand anything; that is, the outputs they produce no more represent understanding than a calculator’s output shows an understanding of mathematics. But that by itself does not imply that they cannot do the tasks correctly: this is an empirical rather than a philosophical question! And as always, I do not think that the current generation of LLMs is actually generally intelligent, in the sense that it can reason in novel situations or accomplish everything a human can do. But this isn’t evidence that LLMs are fundamentally incapable of doing so, especially once the LLM is integrated into a system which does more than output a single string without iteration.
But to the extent that an LLM doing single-shot inference does, in fact, reason properly, the claim that AGI requires what-if, counterfactual, or causal reasoning is not relevant, because we know that LLMs already produce exactly that type of reasoning, whether or not it reflects “true” understanding.
As a final note, in discussing deep uncertainty and robust decision-making, there is a claim that “a human would… update information continuously, and opt for a robust decision drawn from a ‘distribution’ of actions that proved effective in previous analogous situations.” Unfortunately, that isn’t how humans reason; recognition-primed decision making, where people choose actions based directly on their past experiences, doesn’t work that way, and it does not opt for a robust decision. Instead, humans need to do extensive thinking and reflection in order to engage in robust decision making, and there seems to be no reason that LLMs could not do the same types of analysis, even if these systems don’t truly “understand” it. And if you ask an LLM to carefully reason through approaches and evaluate them by considering robustness to different uncertainties, it does a credible job.
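As a sketch of the kind of prompting I have in mind: the call_llm helper below is hypothetical, a stand-in for whichever chat API you use, and the specific decision, options, and uncertainties are invented for illustration.

```python
# A sketch of prompting an LLM to do robustness-across-uncertainties reasoning.
# `call_llm` is a hypothetical stand-in for a real chat-completion API.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM provider of choice")

def robust_decision_prompt(decision: str, options: list[str], uncertainties: list[str]) -> str:
    return (
        f"Decision to make: {decision}\n"
        f"Candidate options: {', '.join(options)}\n"
        f"Key uncertainties: {', '.join(uncertainties)}\n\n"
        "For each option, reason step by step about how it performs under each "
        "uncertainty, then recommend the option that is most robust across all of "
        "them, not the one that is best in the single most likely scenario."
    )

prompt = robust_decision_prompt(
    decision="choose a database for a new product",
    options=["Postgres", "a managed NoSQL store", "SQLite"],
    uncertainties=["10x traffic growth", "strict new compliance rules", "team turnover"],
)
print(prompt)  # feed this to call_llm(prompt) once it is wired up
```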
Footnotes:
Gradient descent on model weights, applied to purely observational data, cannot learn to represent counterfactuals, and because what Pearl calls “do” operations are not represented, the high-dimensional functions which the model learns capture correlations, not causal relationships. But given data which contains counterfactuals, often with causality explicitly incorporated, the networks can, in theory, learn something equivalent to causal Bayesian networks or other causal representations of the data.
I’ll note that I think the typical philosophical case against LLM consciousness goes too far, in that it seems to prove human minds also cannot truly understand—but that’s a different discussion!
Thanks for the post, I agree with the main points.
There is another claim about causality one could make, which would be: LLMs cannot reliably act in the world as robust agents, since by acting in the world you change the world, leading to a distributional shift away from the correlational data the LLM encountered during training.
I think that argument is correct, but it misses an obvious solution: once you let your LLM act in the world, simply let it predict and learn from the tokens that it receives in response. Then, suddenly, the LLM is modeling not merely correlational relationships, but actual causal ones.
I certainly agree with your counterargument. I’m surprised there’s so much skepticism that LLMs can lead to AGI. It’s like this group of futurists doesn’t think in terms of ongoing progress.
The recent post LLM Generality is a Timeline Crux addresses this question and generated some good discussion, much of it skeptical.
The author nonetheless predicts around 60% chance that LLMs lead to AGI, and 80% chance they do if they have scaffolding to turn them into language model cognitive architectures. My argument for how those likely lead quickly to AGI is here.