In “Strong AI Isn’t Here Yet”, Sarah Constantin writes that she believes AGI will require another major conceptual breakthrough in our understanding before it can be built, and that it will not simply be a scaled-up or improved version of the deep-learning algorithms that already exist.
To argue this, she makes the case that current deep learning algorithms have no way to learn “concepts” and only operate on “percepts.” She says:
I suspect that, similarly, we’d have to have understanding of how concepts work on an algorithmic level in order to train conceptual learning.
However, I feel her argument lacks tangible evidence for the claim that deep-learning algorithms do not learn any high-level concepts. It seems to rest on the observation that we currently do not know how to explicitly represent concepts in mathematical or algorithmic terms. But I think if we are to take this as a belief, we should try to predict how the world would look different if deep-learning algorithms could learn concepts entirely on their own, without us understanding how.
So what kinds of problems, if solved by neural networks, would surprise us if we held this belief? Well, to name a couple of experiments that surprise me, I would point to DCGAN and InfoGAN. In the former, the authors are able to extract visual “concepts” out of the generator network by taking the latent vectors of all the examples that share one attribute of their choosing (in the paper they use “smiling” / “not smiling” and “glasses” / “no glasses”) and averaging them. They can then construct new images by doing vector arithmetic in the latent space with this vector and passing the result through the generator, so you can, for example, take a picture of someone without glasses and add glasses to them without altering the rest of their face. In the second paper, the network learns a secondary latent variable vector that extracts disentangled features from the data. Most surprisingly, it seems to learn concepts such as “rotation” (among other things) from a data set of 2D faces, even though the network has no way to express the concept of three dimensions and no such prior knowledge is encoded into it. The latent-space arithmetic is sketched below.
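As a rough illustration (not the DCGAN authors’ code), the attribute-vector arithmetic looks something like the sketch below. The generator function, latent dimension, and the labelled latent-vector groups are all placeholders standing in for a trained model and its encodings:

```python
import numpy as np

# Placeholder for a trained DCGAN generator: maps a latent vector z to an image.
# Here it just returns a blank image so the sketch runs on its own.
def generator(z):
    return np.zeros((64, 64, 3))

rng = np.random.default_rng(0)
latent_dim = 100

# Assume we already have latent vectors for examples with and without the attribute.
z_glasses = rng.normal(size=(16, latent_dim))     # stand-in: faces with glasses
z_no_glasses = rng.normal(size=(16, latent_dim))  # stand-in: faces without glasses

# Average each group and subtract to get an attribute direction, as described above.
glasses_direction = z_glasses.mean(axis=0) - z_no_glasses.mean(axis=0)

# "Add glasses" to a new face by shifting its latent vector along that direction
# and regenerating the image.
z_new_face = rng.normal(size=latent_dim)
image_with_glasses = generator(z_new_face + glasses_direction)
```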
Just this morning, in fact, OpenAI revealed that they had run a very large-scale deep-learning experiment using multiplicative LSTMs on Amazon review data. What was more surprising than the fact that they had beaten the benchmark accuracy on sentiment analysis was that they had done it in an unsupervised manner, by using the LSTM to predict the next character in a given sequence of characters. They discovered that a single neuron in the hidden layer of this LSTM seemed to extract the overall sentiment of the review, and was somehow using this knowledge to get better at predicting the sequence. I would find this very surprising if I believed it were unlikely or impossible for neural networks to extract high-level “concepts” out of data without those concepts being explicitly encoded into the network structure or the data.
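To make the “sentiment neuron” finding concrete, here is a hedged sketch of how one might probe for such a unit, assuming we already have the LSTM’s final hidden states for a set of labelled reviews. The data and the planted unit index are synthetic stand-ins, not values from the OpenAI experiment:

```python
import numpy as np

rng = np.random.default_rng(0)
n_reviews, n_units = 1000, 4096
labels = rng.integers(0, 2, size=n_reviews)        # 0 = negative, 1 = positive
hidden = rng.normal(size=(n_reviews, n_units))     # stand-in for final LSTM hidden states
hidden[:, 1234] += 3.0 * (labels - 0.5)            # plant an illustrative "sentiment unit"

# Correlate each unit's activation with the sentiment label and pick the strongest.
centered = hidden - hidden.mean(axis=0)
corr = (centered * (labels - labels.mean())[:, None]).mean(axis=0) / (
    hidden.std(axis=0) * labels.std()
)
sentiment_unit = int(np.argmax(np.abs(corr)))

# Classify reviews by thresholding that single unit's activation at zero.
preds = (hidden[:, sentiment_unit] > 0).astype(int)
accuracy = (preds == labels).mean()
print(sentiment_unit, accuracy)
```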
What I’m getting at here is that we should be able to set benchmarks on certain well-defined problems and say “Any AI that solves this problem has done concept learning and does concept-level reasoning”, and update based on what types of algorithms solve those problems. And when that list of problems gets smaller and smaller, we really need to watch out for whether we have redefined the meaning of “concept” or drawn the tautological conclusion that the problem didn’t really require concept-level reasoning after all. I feel that this has already happened to a certain degree.
The problem with AGI is not that AIs have no ability to learn “concepts”, it’s that the G in ‘AGI’ is very likely ill-defined. Even humans are not ‘general intelligences’; they’re just extremely capable aggregates of narrow intelligences that collectively implement the rather complex task we call “being a human”. Narrow AIs that implement ‘deep learning’ can learn ‘concepts’ that are tailored to their specific task; for instance, the DeepDream AI famously learns a variety of ‘concepts’ that relate to something looking like a dog. And sometimes these concepts turn out to be usable in a different task, but this is essentially a matter of luck. In the Amazon reviews case, the ‘sentiment’ of a review turned out to be a good predictor of what the review would say, even after controlling for the sorts of low-order correlations in the text that character-based RNNs can be expected to model most easily. I don’t see this as especially surprising, or as having much implication about possible ‘AGI’.
Humans are general intelligences, and that is exactly about having completely general concepts. Is there something you cannot think about? Suppose there is. Then let’s think about that thing. There is now nothing you cannot think about. No current computer AI can do this; when they can, they will in fact be AGIs.