It isn’t only the training process that limits a model’s ability to (for instance) simulate conscious minds, but also the structure of the model itself. For instance, I bet there is literally no possible training that would make GPT-3 do that, because whatever weights you put in it, it isn’t doing a kind of computation that’s capable of simulating conscious minds. But I wouldn’t bet that much at very long odds. My reason for thinking this is that each token-prediction does a computation with not that many steps to it, and it doesn’t seem as if there’s “room” there for anything so exciting; but maaaaaybe there’s some weight vector for the GPT-3 network that, when you give it the right prompt, emits a lengthy “internal monologue” and in the process does simulate a conscious mind.
Actually, here’s a kinda related question. Is the transformer architecture Turing-complete in the sense that some plausible actual transformer network like GPT-3’s, with some possible set of weights and a suitable prompt, will reliably-enough simulate an arbitrary Turing machine for an arbitrary number of steps? No, because the network and its input window are both finite, so there is only a finite number of states it can be in. And maybe there’s a related handwavy argument that any conscious-mind simulation needs too large a repertoire of possible states?
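To make the finite-state point concrete, here is a minimal Python sketch. It assumes deterministic (greedy) decoding and treats the contents of the context window as the whole state; `toy_next_token` is a made-up stand-in for a model, not anything a real transformer computes. By pigeonhole, the window state has to repeat within as many steps as there are possible window contents, after which the output just cycles.

```python
def count_window_states(vocab_size: int, window: int) -> int:
    """Number of distinct contents of a window holding 0..window tokens."""
    return sum(vocab_size ** k for k in range(window + 1))


def toy_next_token(context: tuple) -> int:
    """Hypothetical deterministic 'model' over a 3-token vocabulary."""
    return (sum((i + 1) * t for i, t in enumerate(context)) + 1) % 3


def steps_until_repeat(prompt: tuple, window: int) -> int:
    """Greedy-decode until the window contents repeat; return the step count."""
    seen = set()
    context = prompt[-window:]
    step = 0
    while context not in seen:
        seen.add(context)
        # Append the predicted token and slide the window.
        context = (context + (toy_next_token(context),))[-window:]
        step += 1
    return step


VOCAB, WINDOW = 3, 4
bound = count_window_states(VOCAB, WINDOW)   # 1 + 3 + 9 + 27 + 81
steps = steps_until_repeat((0, 1), WINDOW)   # can never exceed `bound`

# Finite is not the same as small: taking 50257 tokens and a 2048-token
# window (GPT-3's published sizes), the top term alone has thousands of digits.
gpt3_like_digits = len(str(50257 ** 2048))

print(bound, steps, gpt3_like_digits)
```

So the "only finitely many states" argument rules out simulating a Turing machine for an unbounded number of steps, but the bound itself is astronomically large, which is worth keeping in mind for the handwavy "too few states for a mind" version.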
Related related question. Suppose you undertake, whenever your transformer’s output contains “*** READ ADDRESS n ***” or “*** WRITE ADDRESS n ***”, with n a non-negative integer in decimal notation, to stop its token-output at that point and give a new prompt that (in the former case) consists of an integer equal to the last thing it tried to WRITE at ADDRESS n (if any; any value will do, if it never did) and (in the latter case) consists of just “Done”. Is the transformer architecture, so augmented, Turing-complete? Is there some training process that would teach it to exploit this “external memory” effectively?
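A minimal Python sketch of the harness described above, to pin down what the protocol would look like. Several things here are assumptions rather than part of the proposal: the `model` argument is any function from a prompt string to an output string, unwritten addresses read back as 0, and (since the proposal doesn't say how the value to be written is encoded) the value is taken to be the last integer emitted before the WRITE marker. `scripted_model` is a hypothetical stand-in, not a transformer.

```python
import re
from typing import Callable, Dict, List

MARKER = re.compile(r"\*\*\* (READ|WRITE) ADDRESS (\d+) \*\*\*")
LAST_INT = re.compile(r"(\d+)\D*$")  # last run of digits in a string


def run_with_memory(model: Callable[[str], str], prompt: str,
                    max_turns: int = 100) -> List[str]:
    """Drive `model`, servicing READ/WRITE markers from an external store."""
    memory: Dict[int, int] = {}
    transcript: List[str] = []
    for _ in range(max_turns):
        output = model(prompt)
        match = MARKER.search(output)
        if match is None:
            transcript.append(output)  # no memory request: we are done
            break
        # Stop the token output at the marker, as the protocol specifies.
        transcript.append(output[:match.end()])
        op, address = match.group(1), int(match.group(2))
        if op == "READ":
            # ASSUMPTION: never-written addresses read as 0 ("any value will do").
            prompt = str(memory.get(address, 0))
        else:
            # ASSUMPTION: the value is the last integer before the marker.
            value = LAST_INT.search(output[:match.start()])
            memory[address] = int(value.group(1)) if value else 0
            prompt = "Done"
    return transcript


def scripted_model() -> Callable[[str], str]:
    """Hypothetical stand-in: store 42 at address 7, read it back, report it."""
    turn = {"n": 0}

    def model(prompt: str) -> str:
        turn["n"] += 1
        if turn["n"] == 1:
            return "42 *** WRITE ADDRESS 7 *** (text after the marker is cut)"
        if turn["n"] == 2:
            return "*** READ ADDRESS 7 ***"
        return f"The stored value was {prompt}."

    return model


transcript = run_with_memory(scripted_model(), "start")
print(transcript)
```

The harness is the easy part. The open questions are whether weights exist that use it well, and whether any training process would find them.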
A Turing machine is a finite automaton that has access to sufficient space for notes. A Turing machine with a very small finite automaton can simulate an arbitrary program if the program is already written down in the notes. A Turing machine with a large finite automaton can simulate a large program out of the box. ML models can obviously act like finite automata. So they are all Turing-complete, if given access to enough space for making notes, possibly with initialization notes containing a large program.
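The "finite automaton plus notes" picture can be sketched in a few lines of Python. The control below is a plain lookup table for binary increment; the table, the state names and the `run` helper are illustrative choices of mine. The point is only that the control is a finite map from (state, symbol) to (new state, symbol to write, head move), and anything that can compute such a map, a model included, could in principle sit in its place.

```python
from collections import defaultdict

# Finite control for binary increment: (state, read) -> (state, write, move).
INCREMENT = {
    ("carry", "1"): ("carry", "0", -1),  # 1 + carry = 0, keep carrying left
    ("carry", "0"): ("halt", "1", 0),    # absorb the carry and stop
    ("carry", "_"): ("halt", "1", 0),    # ran off the left edge: new digit
}


def run(control, tape_input, state="carry", halt="halt", max_steps=10_000):
    """Run a finite control against an unbounded tape ('_' means blank)."""
    tape = defaultdict(lambda: "_", enumerate(tape_input))
    head = len(tape_input) - 1  # start on the least significant bit
    for _ in range(max_steps):
        if state == halt:
            break
        # The right-hand side is evaluated first, using the old state and cell.
        state, tape[head], move = control[(state, tape[head])]
        head += move
    cells = range(min(tape), max(tape) + 1)
    return "".join(tape[i] for i in cells).strip("_")


eleven_plus_one = run(INCREMENT, "1011")
seven_plus_one = run(INCREMENT, "111")
print(eleven_plus_one, seven_plus_one)
```

All of the unboundedness lives in the tape (the notes). The control itself has three entries.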
This is not at all helpful, because normal training won’t produce interesting finite automata, not unless it learns from appropriate data, which is only straightforward to generate if the target finite automaton is already known. Also, even short-term human memory already acts like ML models and not deliberative examination of written notes, so an LLM-based agent would need to reason in an unusual and roundabout way if it doesn’t have a better architecture that continually learns from observations (and thus makes external notes unnecessary). Internal monologue is still necessary to produce complicated conclusions, but that could just be normal output wrapped in silencing tags.
I’m not sure how obvious it is that “ML models can act like finite automata”. I mean, there are theorems that say things like “a large enough multi-layer perceptron can approximate any function arbitrarily well”, and unless I’m being dim those do indeed indicate that for such a model there exist weights that make it implement a universal Turing machine, but I don’t think that means that e.g. such weights exist that make a transformer of “reasonable” size do that. (Though, on reflection, I think I agree that we should expect that they do.) Your comment about normal training not doing that was rather the point of my final question.
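As a toy illustration of "weights exist that make a network act like a finite automaton", here is a hand-weighted two-unit ReLU layer that computes the transition function of the two-state parity automaton exactly (next state = state XOR input bit). The weights are picked by hand, not learned, and this says nothing about what transformers of "reasonable" size can represent or what training would find. It only shows the existence claim in the smallest case I could think of.

```python
def relu(x: float) -> float:
    return max(0.0, x)


def mlp_transition(state: float, bit: float) -> float:
    """One hidden ReLU layer computing XOR exactly on inputs in {0, 1}."""
    h1 = relu(state + bit)         # 0, 1, 1, 2 on the four input pairs
    h2 = relu(state + bit - 1.0)   # 0, 0, 0, 1 on the four input pairs
    return h1 - 2.0 * h2           # 0, 1, 1, 0: XOR


def run_parity(bits) -> int:
    """Feed a bit string through the network-as-automaton, one bit per step."""
    state = 0.0
    for b in bits:
        state = mlp_transition(state, float(b))
    return int(state)


odd_ones = run_parity([1, 0, 1, 1])
even_ones = run_parity([1, 1])
print(odd_ones, even_ones)
```

Any finite transition table can be handled the same way in principle, at the cost of more hidden units. Whether the required sizes are reasonable is exactly the part the approximation theorems don't settle.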
Right, I don’t know how much data a model stores, and how much of that can be reached through retraining, if all parameters can’t be specified outright. If the translation is bad enough, it couldn’t quote an LLM and memorize its parameters as explicitly accessible raw data using a model of comparable size. Still, an LLM trained on actual language could probably get quite a lot smaller after some lossy compression (that I have no idea how to specify), and it would also take eons to decode from the model (by doing experiments on it to elicit its behavior). So size bounds are not the most practical concern here. But maybe the memorized data could be written down much faster with a reasonable increase in model size?
Hmm, there might be relevant limitations based on the structure of the model, but those limitations seem to be peculiar to the model under consideration. They don’t seem to generalise to arbitrary systems selected for minimising predictive loss on text prediction.
That is, I don’t think they’re a fundamental limitation of language models, and it was the limits of language models I mostly wanted to explore in this post.
Agreed. But:
1. I was commenting on your “Moreover, the diversity and comprehensiveness of the dataset a language model is trained on will limit the capabilities it can actually attain in deployment. I.e. that a particular upper bound exists in principle, does not mean it will be realised in practice.”: I think that in practice what’s realisable will be limited at least as much by the structure of the model as by how it’s trained. So it’s not just “no matter how fancy a model we build, some plausible training methods will not enable it to do this” but also “no matter how fancy a training method we use, some plausible architectures will not be able to do this”, and that seemed worth making explicit.
2. In between “current versions of GPT” and “absolutely anything that is in some sense trying to predict text” it seems like there’s an interesting category of “things with the same general sort of structure as current LLMs but maybe trained differently”.
(I worry a little that a definition of “language model” much less restrictive than that may end up including literally everything capable of using language, including us and hypothetical AGIs specifically designed to be AGIs.)
“no matter how fancy a training method we use, some plausible architectures will not be able to do this”, and that seemed worth making explicit.
Fair enough. I’ll try to add a fragment to the post making this argument (at a high level of generality; I’m too ignorant about LLM architecture details to describe such limitations in concrete terms).
(I worry a little that a definition of “language model” much less restrictive than that may end up including literally everything capable of using language, including us and hypothetical AGIs specifically designed to be AGIs.)
I’m using “language model” here to refer to systems optimised solely for the task of predicting text.