If two tasks reduce to one another, then it is meaningless to ask if a machine is ‘really doing’ one task versus the other.
It is rare that two tasks exactly reduce to one another. When there’s only a partial reduction between two tasks X and Y, it can be genuinely helpful to distinguish “doing X” from “doing Y”, because this lossy mapping causes the tails to come apart, such that one mental model extrapolates correctly and the other fails to do so. To the extent that we care about making high-confidence predictions in situations that are significantly out of distribution, or where the stakes are high, this can matter a whole lot.
Sure, every abstraction is leaky and if we move to extreme regimes then the abstraction will become leakier and leakier.
Does my desktop multiply matrices? Well, not when it’s in the corona of the sun. And it can’t add 10^200-digit numbers.
So what do we mean when we say “this desktop multiplies two matrices”?
We mean that in the range of normal physical environments (air pressure, room temperature, etc.), the physical dynamics of the desktop correspond to matrix multiplication with respect to some conventional encoding of small matrices into the physical states of the desktop.
By adding similar disclaimers, I can say “this desktop writes music” or “this desktop recognises dogs”.
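To make that disclaimer-laden claim concrete, here's a minimal sketch of what it cashes out to. The use of numpy and the particular encoding are my choices for illustration, not anything from the original claim: under the obvious encoding, the machine's arithmetic matches mathematical matrix multiplication for small, ordinary inputs, and the correspondence visibly leaks once the entries get extreme.

```python
import numpy as np

# A toy cash-out of "this desktop multiplies matrices": under a conventional
# encoding (Python lists / float64 arrays), the machine's arithmetic agrees
# with the mathematical matrix product for small, ordinary inputs.
def exact_matmul(a, b):
    """Reference product using Python's arbitrary-precision integers."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

small_a = [[1, 2], [3, 4]]
small_b = [[5, 6], [7, 8]]
assert (np.array(small_a) @ np.array(small_b)).tolist() == exact_matmul(small_a, small_b)

# Outside the normal regime the correspondence leaks: entries around 10**200
# still fit in a float64, but their products overflow to inf, so the
# "matrix multiplier" description stops tracking the mathematical function.
huge = np.array([[1e200, 0.0], [0.0, 1e200]])
print(huge @ huge)  # [[inf, 0.], [0., inf]]
```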
AFAICT the bit where there’s substantive disagreement is always in the middle regime, not the super-close extreme or the super-far extreme. That middle regime is definitely where I feel like debates over the use of frames like simulator theory live.
For example, is the Godot game engine a light transport simulator? In certain respects Godot captures the typical overall appearance of a scene, in a subset of situations. But it actually makes a bunch of weird simplifications and shortcuts under the hood that don’t correspond to any real dynamics. That’s because it isn’t trying to simulate the underlying dynamics of light, it’s trying to reproduce certain broad-strokes visible patterns that light produces.
That difference really matters! If you wanna make reliable and high-fidelity predictions about light transport, or if you wanna know what a scene with a bunch of weird reflective and translucent materials looks like, you may get more predictive mileage from thinking about the actual generating equations (or using a physically-based renderer, which does so for you), rather than treating Godot as a “light transport simulator” in this context. Otherwise you’ve gotta maintain a bunch of special-casing in your reasoning to keep the illusion going.
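Here's a toy analogy for how the two mental models come apart out of distribution. It has nothing to do with Godot or light specifically; the exponential target, the cubic fit, and the evaluation points are all invented for illustration: a model that merely reproduces observed patterns and a model that uses the generating equation agree in the regime the former was tuned for, and diverge badly outside it.

```python
import numpy as np

# "True dynamics": y = e^x. The pattern-matcher only ever sees x in [0, 1].
x_train = np.linspace(0.0, 1.0, 50)
y_train = np.exp(x_train)

# Pattern-reproducer: a cubic fit that looks essentially perfect on [0, 1].
pattern_model = np.poly1d(np.polyfit(x_train, y_train, deg=3))

for x in (0.5, 2.0, 5.0):
    print(x, np.exp(x), pattern_model(x))
# At x=0.5 the two agree to several decimals; at x=5 the cubic is far below
# e^5, even though nothing about the fit "looked wrong" in the fitted regime.
```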
Let’s take LLM Simulator Theory.

We have a particular autoregressive language model μ: T^k → Δ(T), and Simulator Theory says that μ is simulating a whole series of simulacra which are consistent with the prompt. Formally speaking,

$$\mu(t_{k+1} \mid t_1, \ldots, t_k) \;=\; \frac{1}{P(t_1, \ldots, t_k)} \sum_{s \in S} P(s)\, \mu_s(t_1, \ldots, t_k)\, \mu_s(t_{k+1} \mid t_1, \ldots, t_k)$$

where μ_s is the stochastic process corresponding to a simulacrum s ∈ S.
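To make the intended reading of that equation concrete, here's a minimal numerical sketch with a two-token vocabulary and two hand-written “simulacra” modelled as first-order Markov processes. Everything here (the token set, the processes, the uniform prior, the uniform-start convention) is made up purely for illustration, not taken from Simulator Theory itself.

```python
# Minimal numerical sketch of the mixture identity above.

TOKENS = ["a", "b"]

def process_prob(trans, seq):
    """Probability a first-order Markov simulacrum assigns to a whole sequence."""
    p = 1.0 / len(TOKENS)  # uniform distribution over the first token (toy convention)
    for prev, nxt in zip(seq, seq[1:]):
        p *= trans[prev][nxt]
    return p

# Two simulacra with different next-token behaviour after "a".
simulacra = {
    "s1": {"a": {"a": 0.9, "b": 0.1}, "b": {"a": 0.5, "b": 0.5}},
    "s2": {"a": {"a": 0.2, "b": 0.8}, "b": {"a": 0.5, "b": 0.5}},
}
prior = {"s1": 0.5, "s2": 0.5}

def mixture_next_token(prefix):
    """mu(next | prefix) = sum_s P(s) mu_s(prefix) mu_s(next | prefix) / P(prefix)."""
    evidence = sum(prior[s] * process_prob(tr, prefix) for s, tr in simulacra.items())
    return {
        t: sum(prior[s] * process_prob(tr, prefix) * tr[prefix[-1]][t]
               for s, tr in simulacra.items()) / evidence
        for t in TOKENS
    }

print(mixture_next_token(["a", "a", "a"]))
# {'a': ~0.867, 'b': ~0.133}: the prefix "aaa" is far more likely under s1,
# so the mixture's next-token distribution leans toward s1's behaviour.
```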
Now, there are two objections to this:
Firstly, is it actually true that μ has this particular structure?
Secondly, even if it were true, why are we warranted in saying that GPT is simulating all these simulacra?
The first objection is a purely technical question, whereas the second is conceptual. In this article, I present a criterion which partially answers the second objection.
Note that the first objection — is it actually true that μ has this particular structure? — is a question about a particular autoregressive language model. You might give one answer for GPT-2 and a different answer for GPT-4.
I’m confused what you mean to claim. Understood that a language model factorizes the joint distribution over tokens autoregressively, into the product of next-token distributions conditioned on their prefixes. Also understood that it is possible to instead factorize the joint distribution over tokens into a conditional distribution over tokens conditioned on a latent variable (call it s) weighted by the prior over s. These are claims about possible factorizations of a distribution, and about which factorization the language model uses.
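Spelling out the two factorizations being referred to (standard identities; P here is the joint distribution over a length-n token sequence and s is the latent variable):

```latex
\begin{align*}
  % Autoregressive factorization (what the language model computes directly):
  P(t_1, \dots, t_n) &= \prod_{k=0}^{n-1} P(t_{k+1} \mid t_1, \dots, t_k) \\
  % Latent-variable factorization (a mixture over the latent variable s):
  P(t_1, \dots, t_n) &= \sum_{s \in S} P(s)\, P(t_1, \dots, t_n \mid s)
\end{align*}
```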
What are you claiming beyond that?
Are you claiming something about the internal structure of the language model?
Are you claiming something about the structure of the true distribution over tokens?
Are you claiming something about the structure of the generative process that produces the true distribution over tokens?
Are you claiming something about the structure of the world more broadly?
Are you claiming something about correspondences between the above?