Arguments 6-10 seem like the most interesting ones (as they respond more directly to the argument). But for all of them except argument 6, it seems like the same argument would imply that humans would not be generally intelligent.
[Argument 6]
The kinds of AI systems that we are worried about are the kinds of systems that can do original scientific research and autonomously form plans for taking over the world. LLMs are trained to write text that would be maximally unsurprising if found on the internet. These two things are fundamentally not the same thing. Why, exactly, would we expect that a system that is good at the latter necessarily would be able to do the former?
Because text on the Internet sometimes involves people using logic, reasoning, hypothesis generation, analyzing experimental evidence, etc, and so plausibly the simplest program that successfully predicts that text would do so by replicating that logic, reasoning etc, which you could then chain together to make scientific progress.
What does the argument say in response?
[Argument 7]
This, in turn, suggests a data structure that is discrete and combinatorial, with syntax trees, etc, and neural networks (according to the argument) do not use such representations.
How do you know neural networks won’t use such representations? What is true of human brains but not of neural networks such that human brains can do this but neural networks can’t?
(Particularly interested in this one since you said you found it compelling.)
[Argument 8]
However, neural networks typically do not have this ability, with most neural networks [...] instead being more analogous to Boolean circuits.
What is true of human brains but not neural networks such that human brains can represent programs but neural networks can’t?
(I’d note that I’m including chain-of-thought as a way that neural networks can run programs.)
[Argument 9]
However, the fact that GPT-3 can eg play chess, but not solve a verbally described maze, is evidence that it relies on memorisation as well.
I would bet that you can play chess, but you cannot fold a protein (even if the rules for protein folding were verbally described to you). What’s the difference?
[Argument 10]
If we instead try to train them continuously, then we run into the problem of catastrophic forgetting, which we currently do not know how to solve.
Why doesn’t this apply to humans as well? We forget stuff all the time.
But for all of them except argument 6, it seems like the same argument would imply that humans would not be generally intelligent.
Why is that?
Because text on the Internet sometimes involves people using logic, reasoning, hypothesis generation, analyzing experimental evidence, etc, and so plausibly the simplest program that successfully predicts that text would do so by replicating that logic, reasoning etc, which you could then chain together to make scientific progress.
What does the argument say in response?
There are a few ways to respond.
First of all, what comes after “plausibly” could just turn out to be wrong. Many people thought human-level chess would require human-like strategising, but this turned out to be wrong (though the case for text prediction is certainly more convincing).
Secondly, an LLM is almost certainly not learning the lowest K-complexity program for text prediction, and given that, the case becomes less clear. For example, suppose an LLM instead learns a truly massive ensemble of simple heuristics that together produce human-like text. It seems plausible that such an ensemble could produce convincing results without replicating logic, reasoning, and so on. IBM Watson did something along these lines. Studies such as this one also provide some evidence for this perspective.
To give an intuition pump, suppose we trained an extremely large random forest classifier on the same data that GPT-3 was trained on. How good would the output of this classifier be? It would probably not be as good as GPT-3, but it would probably still be very impressive. And a random forest classifier is also a universal function approximator, whose performance keeps improving as it is given more training data; I’m sure there are scaling laws for them. But I don’t think many people believe that we could get AGI by making a sufficiently big random forest classifier for next-token prediction. Why is that? I have found this to be an interesting prompt to think about. For me, a gestalt shift that makes long timelines seem plausible is to look at LLMs sort of like how you would look at a giant random forest classifier.
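To make the intuition pump concrete, here is a minimal sketch of what “random forest as next-token predictor” could look like (purely illustrative: the toy corpus, the fixed-length context window, and the use of sklearn’s RandomForestClassifier are my own choices, and a real version of the experiment would of course need GPT-3-scale data and a vastly larger forest):

```python
# Toy sketch: a random forest as a next-token predictor.
from sklearn.ensemble import RandomForestClassifier

corpus = "the cat sat on the mat . the dog sat on the rug .".split()
vocab = sorted(set(corpus))
tok2id = {t: i for i, t in enumerate(vocab)}

CONTEXT = 3  # fixed-length window of previous token ids used as features
X, y = [], []
for i in range(CONTEXT, len(corpus)):
    X.append([tok2id[t] for t in corpus[i - CONTEXT:i]])
    y.append(tok2id[corpus[i]])

model = RandomForestClassifier(n_estimators=100).fit(X, y)

prompt = ["the", "dog", "sat"]
pred = model.predict([[tok2id[t] for t in prompt]])[0]
print("next token:", vocab[pred])
```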
(Also, just to reiterate, I am not personally convinced of long timelines, I am just trying to make the best arguments for this view more easily available.)
How do you know neural networks won’t use such representations?
I can’t say this for sure, especially not for newer or more exotic architectures, but it certainly does not seem like these are the kinds of representations that deep learning systems are likely to learn. Rather, they seem much more likely to learn manifold-like representations, where proximity corresponds to relevant similarity, or something along those lines. Syntactically organised, combinatorial representations are certainly not very “native” to the deep learning paradigm.
It is worth clarifying that neural networks of course could in principle implement these representations, at least in the same sense as how a Boolean network can implement a Turing machine. The question is whether they can, in practice, learn such representations in a reasonable way. Consider the example I gave of how an MLP can’t learn the identity function unless the training data essentially forces it to memorise one; the question is whether or not a similar thing is true of LoT-style representations. Can you think of a natural way to represent a LoT in a vector space, that a neural network might plausibly learn, without being “forced” by the training data?
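To illustrate the MLP example, here is a minimal sketch (my own toy setup, not anything from the post) of a small tanh MLP that fits the identity function on its training range but fails to track it outside that range, since its output is bounded by the saturating hidden layers:

```python
# Sketch: an MLP fit to y = x on [-1, 1] does not extrapolate to inputs
# far outside that range (the tanh hidden layers make the output bounded).
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(
    nn.Linear(1, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x_train = torch.linspace(-1, 1, 256).unsqueeze(1)
for _ in range(2000):
    opt.zero_grad()
    loss = ((net(x_train) - x_train) ** 2).mean()
    loss.backward()
    opt.step()

x_test = torch.tensor([[0.5], [5.0], [50.0]])
print(net(x_test).squeeze().tolist())  # close to 0.5 in-range; typically far from 5 and 50 out-of-range
```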
As an extremely simple example, a CNN and an MLP will in practice not learn the same kinds of representations, even though the CNN model space is contained in the MLP model space (if you make them wide enough). How do I know that an MLP won’t learn a CNN-like representation? Because these representations are not “natural” to MLPs, and the MLP will not be explicitly incentivised to learn them. My sense is that most deep learning systems are inclined away from LoT-like representations for similar reasons.
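As a quick sanity check on the claim that the CNN model space is contained in the MLP model space, here is a small sketch (my own construction) writing a 1-D convolution as an equivalent dense matrix multiply. The point is that the equivalent dense weights form a very specific sparse, weight-shared pattern, and nothing in ordinary MLP training explicitly pushes the weights towards that pattern:

```python
# A 1-D convolution expressed as an equivalent dense ("MLP-style") matrix multiply.
import numpy as np

x = np.random.randn(8)          # input signal
k = np.array([1.0, -2.0, 0.5])  # convolution kernel (no padding, stride 1)

conv_out = np.array([x[i:i + 3] @ k for i in range(len(x) - 2)])

# Equivalent dense weight matrix: shifted copies of the same kernel.
W = np.zeros((len(x) - 2, len(x)))
for i in range(len(x) - 2):
    W[i, i:i + 3] = k

print(np.allclose(conv_out, W @ x))  # True
```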
What is true of human brains but not of neural networks such that human brains can do this but neural networks can’t?
A human brain is not a tabula rasa system trained by gradient descent. I don’t know how a human brain is organised, what learning algorithms are used, or what parts are learnt as opposed to innate, etc, but it does not seem as though it works in the same way as a deep learning system.
What is true of human brains but not neural networks such that human brains can represent programs but neural networks can’t?
(I’d note that I’m including chain-of-thought as a way that neural networks can run programs.)
Here I will again just say that a human brain isn’t a tabula rasa system trained by gradient descent, so it is not inherently surprising for one of the two to have a property that the other one does not.
Chain-of-thought and attention mechanisms certainly do seem to bring deep learning systems much closer to the ability to reason in terms of variables. Whether or not it is sufficient, I do not know.
I would bet that you can play chess, but you cannot fold a protein (even if the rules for protein folding were verbally described to you). What’s the difference?
Why wouldn’t I be able to fold a protein? At least if the size of the relevant state space is similar to that of eg chess.
(Also, to be clear, GPT-3 struggles with verbally described mazes with as few as ~5 states.)
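The exact prompts are not reproduced here, but a verbally described maze of roughly that size might look something like the following hypothetical example, with breadth-first search giving the ground-truth answer the model would need to produce:

```python
# A hypothetical "verbally described maze" task with 5 rooms, plus the
# breadth-first search that yields the ground-truth shortest path.
from collections import deque

description = (
    "Room A connects to rooms B and C. Room B connects to room D. "
    "Room C connects to room E. Room E connects to room D. "
    "You start in room A. What is the shortest path to room D?"
)
maze = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "E": ["D"], "D": []}

def shortest_path(graph, start, goal):
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])

print(shortest_path(maze, "A", "D"))  # ['A', 'B', 'D']
```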
Why doesn’t this apply to humans as well? We forget stuff all the time.
The argument would have to be that humans are more strategic with what to remember, and what to forget.
Meta: A lot of this seems to have the following form:
You: Here is an argument that neural networks have property X.
Me: But that argument as you’ve stated it would imply that humans have property X, which is false.
You: Humans and neural networks work differently, so it wouldn’t be surprising if neural networks have property X and humans don’t.
I think you are misunderstanding what I am trying to do here. I’m not trying to claim that humans and neural networks will have the same properties or be identical. I’m trying to evaluate how much I should update on the particular argument you have provided. The general rule I’m following is “if the argument would say false things about humans, then don’t update on it”. It may in fact be the case that humans and neural networks differ on that property, but if so it will be for some other reason. There is a general catchall category of “maybe something I didn’t think about makes humans and neural networks different on this property”, and indeed I even assign it decently high probability, but that doesn’t affect how much I should update on this particular argument.
Responding to particular pieces:
Why is that?
The rest of the comment was justifying that.
Studies such as this one also provide some evidence for this perspective.
I’m not seeing why that’s evidence for the perspective. Even when word order is scrambled, if you see “= 32 44 +” and you have to predict the remaining number, you should predict some combination of 76, 12, and −12 to get optimal performance; to do that you need to be able to add and subtract, so the model presumably still develops addition and subtraction circuits. Similarly for text that involves logic and reasoning, even after scrambling word order it would still be helpful to use logic and reasoning to predict which words are likely to be present. The overall argument for why the resulting system would have strong, general capabilities seems to still go through.
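To spell out the arithmetic point: if the visible tokens are “+”, “=”, “32” and “44” in some scrambled order, then the orderings that form a valid equation pin the missing number down to exactly those three candidates, and finding them requires actually doing the addition and subtraction. A toy enumeration of mine:

```python
# Which missing numbers n are consistent with some ordering of
# {32, 44, n} into a valid "a + b = c" equation?
from itertools import permutations

known = [32, 44]
candidates = set()
for n in range(-200, 201):
    for a, b, c in permutations(known + [n]):
        if a + b == c:
            candidates.add(n)
print(sorted(candidates))  # [-12, 12, 76]
```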
In addition, I don’t know why you expect that intelligence can’t be implemented through “a truly massive ensemble of simple heuristics”.
But I don’t think many people believe that we could get AGI by making a sufficiently big random forest classifier for next-token prediction. Why is that?
Huh, really? I think that’s pretty plausible, for all the same reasons that I think it’s plausible in the neural network case. (Though not as likely, since I haven’t seen the scaling laws for random forests extend as far as in the neural network case, and the analogy to human brains seems slightly weaker.) Why don’t you think a big random forest classifier could lead to AGI?
Can you think of a natural way to represent a LoT in a vector space, that a neural network might plausibly learn, without being “forced” by the training data?
But it is “forced” by the training data? The argument here is that text prediction is hard enough that the only way the network can do it (to a very very high standard) is to develop these sorts of representation?
I certainly agree that a randomly initialized network is not going to have sensible representations, just as I’d predict that a randomly initialized human brain is not going to have sensible representations (modulo maybe some innate representations encoded by the genome). I assume you are saying something different from that but I’m not sure what.
it does not seem as though it works in the same way as a deep learning system.
But why not? If I were to say “it seems as though the human brain works like a deep learning system, while of course being implemented somewhat differently”, how would you argue against that?
Why wouldn’t I be able to fold a protein? At least if the size of the relevant state space is similar to that of eg chess.
Oh, is your point “LLMs do not have a general notion of search that they can apply to arbitrary problems”? I agree this is currently true, whereas humans do have this. This doesn’t seem too relevant to me, and I don’t buy defining memorization as “things that are not general-purpose search” and then saying “things that do memorization are not intelligent”, that seems too strong.
The argument would have to be that humans are more strategic with what to remember, and what to forget.
Do you actually endorse that response? Seems mostly false to me, except inasmuch as humans can write things down on external memory (which I expect an LLM could also easily do, we just haven’t done that yet).
The general rule I’m following is “if the argument would say false things about humans, then don’t update on it”.
Yes, this is of course very sensible. However, I don’t see why these arguments would apply to humans, unless you make some additional assumption or connection that I am not making. Considering the rest of the conversation, I assume the difference is that you draw a stronger analogy between brains and deep learning systems than I do?
I want to ask a question that goes something like “how correlated is your credence that arguments 5-10 apply to human brains with your credence that human brains and deep learning systems are analogous in important sense X”? But because I don’t quite know what your beliefs are, or why you say that arguments 5-10 apply to humans, I find it hard to formulate this question in the right way.
For example, regarding argument 7 (language of thought), consider the following two propositions:
1. Some part of the human brain is hard-coded to use LoT-like representations, and the way that these representations are updated in response to experience is not analogous to gradient descent.
2. Updating the parameters of a neural network with gradient descent is very unlikely to yield (and maintain) LoT-like representations.
These claims could both be true simultaneously, no? Why, concretely, do you think that arguments 5-10 apply to human brains?
I’m not seeing why that’s evidence for the perspective. Even when word order is scrambled, if you see “= 32 44 +” and you have to predict the remaining number, you should predict some combination of 76, 12, and −12 to get optimal performance; to do that you need to be able to add and subtract, so the model presumably still develops addition and subtraction circuits. Similarly for text that involves logic and reasoning, even after scrambling word order it would still be helpful to use logic and reasoning to predict which words are likely to be present. The overall argument for why the resulting system would have strong, general capabilities seems to still go through.
It is empirically true that the resulting system has strong and general capabilities, there is no need to question that. What I mean is that this is evidence that those capabilities are a result of information processing that is quite dissimilar from what humans do, which in turn opens up the possibility that those processes could not be re-tooled to create the kind of system that could take over the world. In particular, they could be much more shallow than they seem.
It is not hard to argue that a model with general capabilities for reasoning, hypothesis generation, world modelling, etc, would get a good score on the LLM training task. However, I think one of the central lessons from the history of AI is that there are probably also many other ways to get a good score on this task.
In addition, I don’t know why you expect that intelligence can’t be implemented through “a truly massive ensemble of simple heuristics”.
Given a sufficiently loose definition of “intelligence”, I would expect that you almost certainly could do this. However, if we instead consider systems that would be able to overpower humanity, or very significantly shorten the amount of time before such a system could be created, then it is much less clear to me.
Why don’t you think a big random forest classifier could lead to AGI?
I don’t rule out the possibility, but it seems unlikely that such a system could learn representations and circuits that would enable sufficiently strong out-of-distribution generalisation.
But it is “forced” by the training data? The argument here is that text prediction is hard enough that the only way the network can do it (to a very very high standard) is to develop these sorts of representation?
I think this may be worth zooming in on. One of the main points I’m trying to get at is that it is not just the asymptotic behaviour of the system that matters; two other (plausibly connected) things that are at least as important are how well the system generalises out-of-distribution, and how much data it needs to attain that performance. In other words, how good it is at extrapolating from observed examples to new situations. A system could be very bad at this, and yet eventually, with enough training data, get good in-distribution performance.
The main point of LoT-like representations would be a better ability to generalise. This benefit is removed if you could only learn LoT-like representations by observing training data corresponding to all the cases you would like to generalise to.
I certainly agree that a randomly initialized network is not going to have sensible representations, just as I’d predict that a randomly initialized human brain is not going to have sensible representations (modulo maybe some innate representations encoded by the genome). I assume you are saying something different from that but I’m not sure what.
Yes, I am not saying that.
Maybe I can rephrase it this way: to get us to AGI, LLMs would need to have a sufficiently good inductive bias, and I’m not convinced that they actually do.
But why not? If I were to say “it seems as though the human brain works like a deep learning system, while of course being implemented somewhat differently”, how would you argue against that?
It is hard for me to argue against this, without knowing in more detail what you mean by “like”, and “somewhat differently”, as well as knowing what pieces of evidence underpin this belief/impression.
I would be quite surprised if there aren’t important high-level principles in common between deep learning and at least parts of the human brain (it would be a bit too much of a coincidence if not). However, this does not mean that deep learning (in its current form) captures most of the important factors behind human intelligence. Given that there are both clear physiological differences (some of which seem more significant than others) and many behavioural differences, I think that the default should be to assume that there are important principles of human cognition that are not captured by (current) deep learning.
I know several arguments in favour of drawing a strong analogy between the brain and deep learning, and I have arguments against those arguments. However, I don’t know if you believe in any of these arguments (eg, some of them are arguments like “the brain is made out of neurons, therefore deep learning”), so I don’t want to type out long replies before I know why you believe that human brains work like deep learning systems.
Oh, is your point “LLMs do not have a general notion of search that they can apply to arbitrary problems”? I agree this is currently true, whereas humans do have this. This doesn’t seem too relevant to me, and I don’t buy defining memorization as “things that are not general-purpose search” and then saying “things that do memorization are not intelligent”, that seems too strong.
Yes, that was my point. I’m definitely not saying that intelligence = search; I just brought this up as an example of a case where GPT-3 has an impressive ability, but where the mechanism behind that ability is better construed as “memorising the training data” rather than “understanding the problem”. The fact that the example involved search was coincidental.
Do you actually endorse that response? Seems mostly false to me, except inasmuch as humans can write things down on external memory (which I expect an LLM could also easily do, we just haven’t done that yet).
I don’t actually know much about this, but that is the impression I have got from speaking with people who work on this. Introspectively, it also feels like it’s very non-random what I remember. But if we want to go deeper into this track, I would probably need to look more closely at the research first.
However, I don’t see why these arguments would apply to humans
Okay, I’ll take a stab at this.
6. Word Prediction is not Intelligence
“The kinds of humans that we are worried about are the kinds of humans that can do original scientific research and autonomously form plans for taking over the world. Human brains learn to take actions and plans that previously led to high rewards (outcomes like eating food when hungry, having sex, etc)*. These two things are fundamentally not the same thing. Why, exactly, would we expect that a system that is good at the latter necessarily would be able to do the former?”
*I expect that this isn’t a fully accurate description of human brains, but I expect that if we did write the full description the argument would sound the same.
7. The Language of Thought
“This, in turn, suggests a data structure that is discrete and combinatorial, with syntax trees, etc, and humans (according to the argument) do not use such representations. We should therefore expect humans to at some point hit a wall or limit to what they are able to do.”
(I find it hard to make the argument here because there is no argument—it’s just flatly asserted that neural networks don’t use such representations, so all I can do is flatly assert that humans don’t use such representations. If I had to guess, you would say something like “matrix multiplications don’t seem like they can be discrete and combinatorial”, to which I would say “the strength of brain neuron synapse firings doesn’t seem like it can be discrete and combinatorial”.)
8. Programs vs Circuits
We know, from computer science, that it is very powerful to be able to reason in terms of variables and operations on variables. It seems hard to see how you could have human-level intelligence without this ability. However, humans do not typically have this ability, with most human brains instead being more analogous to Boolean circuits, given their finite size and architecture of neuron connections.
9. Generalisation vs Memorisation
In this one I’d give the protein folding example, but apparently you think you’d be able to fold proteins just as well as you’d be able to play chess if they had similar state space sizes, which seems pretty wild to me.
Do you perhaps agree that you would have a hard time navigating in a 10-D space? Clearly you have simply memorized a bunch of heuristics that together are barely sufficient for navigating 3-D space, rather than truly understanding the underlying algorithm for navigating spaces.
10. Catastrophic Forgetting
(Discussed previously, I think humans are not very deliberate / selective about what they do / don’t forget, except when they use external tools.)
In some other parts, I feel like you are being one-sidedly skeptical, e.g.
“In particular, they could be much more shallow than they seem.”
They could also be much more general than they seem.
I don’t rule out the possibility, but it seems unlikely that such a system could learn representations and circuits that would enable sufficiently strong out-of-distribution generalisation.
Perhaps it would enable even stronger OOD generalisation than we have currently.
There could be good reasons for being one-sidedly skeptical, but I think you need to actually say what the reasons are. E.g. I directionally agree with you on the random forests case, but my reason for being one-sidedly skeptical is “we probably would have noticed if random forests generalized better and used them instead of neural nets, so probably they don’t generalize better”. Another potential argument is “decision trees learn arbitrary piecewise linear decision boundaries, whereas neural nets learn manifolds, reality seems more likely to be the second one” (tbc I don’t necessarily agree with this).
“The kinds of humans that we are worried about are the kinds of humans that can do original scientific research and autonomously form plans for taking over the world. Human brains learn to take actions and plans that previously led to high rewards (outcomes like eating food when hungry, having sex, etc)*. These two things are fundamentally not the same thing. Why, exactly, would we expect that a system that is good at the latter necessarily would be able to do the former?”
This feels like a bit of a digression, but we do have concrete examples of systems that are good at eating food when hungry, having sex, and so on, without being able to do original scientific research and autonomously form plans for taking over the world: animals. And the difference between humans and animals isn’t just that humans have more training data (or even that we are that much better at survival and reproduction in the environment of evolutionary adaptation). But I should also note that I consider argument 6 to be one of the weaker arguments I know of.
“We know, from computer science, that it is very powerful to be able to reason in terms of variables and operations on variables. It seems hard to see how you could have human-level intelligence without this ability. However, humans do not typically have this ability, with most human brains instead being more analogous to Boolean circuits, given their finite size and architecture of neuron connections.”
The fact that human brains have a finite size and architecture of neuron connections does not mean that they are well-modelled as Boolean circuits. For example, a (real-world) computer is better modelled as a Turing machine than as a finite-state automaton, even though there is a sense in which they actually are finite-state automata.
The brain is made out of neurons, yes, but it matters a great deal how those neurons are connected. Depending on the answer to that question, you could end up with a system that behaves more like Boolean circuits, or more like a Turing machine, or more like something else.
With neural networks, the training algorithm and the architecture together determine how the neurons end up connected, and therefore whether the resulting system is better thought of as a Boolean circuit, a Turing machine, or something else. If the wiring of the brain is determined by a different mechanism than what determines the wiring of a deep learning system, then the two systems could end up with very different properties, even if they are made out of similar kinds of parts.
With the brain, we don’t know what determines the wiring. This makes it difficult to draw strong conclusions about the high-level behaviour of brains from their low-level physiology. With deep learning, it is easier to do this.
“I find it hard to make the argument here because there is no argument—it’s just flatly asserted that neural networks don’t use such representations, so all I can do is flatly assert that humans don’t use such representations. If I had to guess, you would say something like ‘matrix multiplications don’t seem like they can be discrete and combinatorial’, to which I would say ‘the strength of brain neuron synapse firings doesn’t seem like it can be discrete and combinatorial’.”
What representations you end up with does not just depend on the model space; it also depends on the learning algorithm. Matrix multiplications can be discrete and combinatorial. The question is whether those are the kinds of representations that you would in fact end up with when you train a neural network by gradient descent, which to me seems unlikely. The brain does (most likely) not use gradient descent, so the argument does not apply to the brain.
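To be concrete about “matrix multiplications can be discrete and combinatorial”, here is a toy example of my own: symbols encoded as one-hot vectors, and a 0/1 matrix acting as a lookup table, so that a single matrix multiply implements a purely symbolic operation:

```python
# A matrix multiply implementing a discrete symbolic operation:
# a successor function on a small alphabet, via a permutation matrix.
import numpy as np

symbols = ["a", "b", "c", "d"]

def one_hot(s):
    v = np.zeros(len(symbols))
    v[symbols.index(s)] = 1.0
    return v

# Successor map a->b, b->c, c->d, d->a as a permutation matrix.
M = np.zeros((4, 4))
for i in range(4):
    M[(i + 1) % 4, i] = 1.0

out = M @ one_hot("b")
print(symbols[int(out.argmax())])  # "c"
```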
“Do you perhaps agree that you would have a hard time navigating in a 10-D space? Clearly you have simply memorized a bunch of heuristics that together are barely sufficient for navigating 3-D space, rather than truly understanding the underlying algorithm for navigating spaces.”
It would obviously be harder for me to do this, and narrow heuristics are obviously an important part of intelligence. But I could do it, which suggests a stronger transfer ability than what would be suggested if I couldn’t do this.
“In some other parts, I feel like you are being one-sidedly skeptical.”
Yes, as I said, my goal with this post is not to present a balanced view of the issue. Rather, my goal is just to summarise as many arguments as possible for being skeptical of strong scaling. This makes the skepticism one-sided in some places.