Say Not “Complexity”
Once upon a time . . .
This is a story from when I first met Marcello, with whom I would later work for a year on AI theory; but at this point I had not yet accepted him as my apprentice. I knew that he competed at the national level in mathematical and computing olympiads, which sufficed to attract my attention for a closer look; but I didn’t know yet if he could learn to think about AI.
I had asked Marcello to say how he thought an AI might discover how to solve a Rubik’s Cube. Not in a preprogrammed way, which is trivial, but rather how the AI itself might figure out the laws of the Rubik universe and reason out how to exploit them. How would an AI invent for itself the concept of an “operator,” or “macro,” which is the key to solving the Rubik’s Cube?
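(For concreteness, here is one minimal sketch of what a “macro” could mean once discovered: a fixed sequence of primitive moves collapsed into a single reusable operator. The six-position toy permutation puzzle below is an illustrative assumption of mine, not a faithful cube model and not any particular AI’s representation.)

# A minimal sketch, not the author's method: a "macro" is a fixed
# sequence of primitive moves treated as one reusable operator.
# The toy puzzle and move tables are illustrative assumptions only.

def compose(p, q):
    """Compose permutations: the result applies p first, then q."""
    return tuple(p[i] for i in q)

def apply_perm(state, p):
    """Apply permutation p to a state (a tuple of sticker labels)."""
    return tuple(state[i] for i in p)

IDENTITY = (0, 1, 2, 3, 4, 5)
MOVE_A = (1, 2, 0, 3, 4, 5)   # primitive move: 3-cycle on positions 0, 1, 2
MOVE_B = (0, 1, 2, 4, 5, 3)   # primitive move: 3-cycle on positions 3, 4, 5

def macro(moves):
    """Collapse a sequence of primitive moves into a single operator."""
    p = IDENTITY
    for m in moves:
        p = compose(p, m)
    return p

start = ("r", "g", "b", "y", "o", "w")
m = macro([MOVE_A, MOVE_A, MOVE_B])   # the macro is itself an operator
stepwise = apply_perm(apply_perm(apply_perm(start, MOVE_A), MOVE_A), MOVE_B)
assert apply_perm(start, m) == stepwise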
At some point in this discussion, Marcello said: “Well, I think the AI needs complexity to do X, and complexity to do Y—”
And I said, “Don’t say ‘complexity.’ ”
Marcello said, “Why not?”
I said, “Complexity should never be a goal in itself. You may need to use a particular algorithm that adds some amount of complexity, but complexity for the sake of complexity just makes things harder.” (I was thinking of all the people whom I had heard advocating that the Internet would “wake up” and become an AI when it became “sufficiently complex.”)
And Marcello said, “But there’s got to be some amount of complexity that does it.”
I closed my eyes briefly, and tried to think of how to explain it all in words. To me, saying “complexity” simply felt like the wrong move in the AI dance. No one can think fast enough to deliberate, in words, about each sentence of their stream of consciousness; for that would require an infinite recursion. We think in words, but our stream of consciousness is steered below the level of words, by the trained-in remnants of past insights and harsh experience . . .
I said, “Did you read ‘A Technical Explanation of Technical Explanation’?”1
“Yes,” said Marcello.
“Okay,” I said. “Saying ‘complexity’ doesn’t concentrate your probability mass.”
“Oh,” Marcello said, “like ‘emergence.’ Huh. So . . . now I’ve got to think about how X might actually happen . . .”
That was when I thought to myself, “Maybe this one is teachable.”
Complexity is not a useless concept. It has mathematical definitions attached to it, such as Kolmogorov complexity and Vapnik-Chervonenkis complexity. Even on an intuitive level, complexity is often worth thinking about—you have to judge the complexity of a hypothesis and decide if it’s “too complicated” given the supporting evidence, or look at a design and try to make it simpler.
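(A brief gloss of those definitions, not spelled out in the essay itself: for a fixed universal Turing machine $U$, the Kolmogorov complexity of a string $x$ is the length of the shortest program that prints it,

$K_U(x) = \min\{\, \ell(p) : U(p) = x \,\},$

while the Vapnik-Chervonenkis dimension of a hypothesis class $\mathcal{H}$ is the size of the largest set of points that $\mathcal{H}$ can label in every possible way,

$\mathrm{VCdim}(\mathcal{H}) = \max\{\, n : \text{some } x_1, \dots, x_n \text{ are shattered by } \mathcal{H} \,\}.$

Each makes “complexity” precise enough to do work; the bare word does not.)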
But concepts are not useful or useless of themselves. Only usages are correct or incorrect. In the step Marcello was trying to take in the dance, he was trying to explain something for free, get something for nothing. It is an extremely common misstep, at least in my field. You can join a discussion on artificial general intelligence and watch people doing the same thing, left and right, over and over again—constantly skipping over things they don’t understand, without realizing that’s what they’re doing.
In an eyeblink it happens: putting a non-controlling causal node behind something mysterious, a causal node that feels like an explanation but isn’t. The mistake takes place below the level of words. It requires no special character flaw; it is how human beings think by default, how they have thought since ancient times.
What you must avoid is skipping over the mysterious part; you must linger at the mystery to confront it directly. There are many words that can skip over mysteries, and some of them would be legitimate in other contexts—“complexity,” for example. But the essential mistake is that skip-over, regardless of what causal node goes behind it. The skip-over is not a thought, but a microthought. You have to pay close attention to catch yourself at it. And when you train yourself to avoid skipping, it will become a matter of instinct, not verbal reasoning. You have to feel which parts of your map are still blank, and more importantly, pay attention to that feeling.
I suspect that in academia there is a huge pressure to sweep problems under the rug so that you can present a paper with the appearance of completeness. You’ll get more kudos for a seemingly complete model that includes some “emergent phenomena,” versus an explicitly incomplete map where the label says “I got no clue how this part works” or “then a miracle occurs.” A journal may not even accept the latter paper, since who knows but that the unknown steps are really where everything interesting happens?2
And if you’re working on a revolutionary AI startup, there is an even huger pressure to sweep problems under the rug; or you will have to admit to yourself that you don’t know how to build the right kind of AI yet, and your current life plans will come crashing down in ruins around your ears. But perhaps I am over-explaining, since skip-over happens by default in humans. If you’re looking for examples, just watch people discussing religion or philosophy or spirituality or any science in which they were not professionally trained.
Marcello and I developed a convention in our AI work: when we ran into something we didn’t understand, which was often, we would say “magic”—as in, “X magically does Y”—to remind ourselves that here was an unsolved problem, a gap in our understanding. It is far better to say “magic” than “complexity” or “emergence”; the latter words create an illusion of understanding. Wiser to say “magic,” and leave yourself a placeholder, a reminder of work you will have to do later.
1 http://lesswrong.com/rationality/a-technical-explanation-of-technical-explanation
2 And yes, it sometimes happens that all the non-magical parts of your map turn out to also be non-important. That’s the price you sometimes pay, for entering into terra incognita and trying to solve problems incrementally. But that makes it even more important to know when you aren’t finished yet. Mostly, people don’t dare to enter terra incognita at all, for the deadly fear of wasting their time.