Belief in Intelligence
Since I am so uncertain of Kasparov’s moves, what is the empirical content of my belief that “Kasparov is a highly intelligent chess player”? What real-world experience does my belief tell me to anticipate? Is it a cleverly masked form of total ignorance?
To sharpen the dilemma, suppose Kasparov plays against some mere chess grandmaster Mr. G, who’s not in the running for world champion. My own ability is far too low to distinguish between these levels of chess skill. When I try to guess Kasparov’s move, or Mr. G’s next move, all I can do is try to guess “the best chess move” using my own meager knowledge of chess. Then I would produce exactly the same prediction for Kasparov’s move or Mr. G’s move in any particular chess position. So what is the empirical content of my belief that “Kasparov is a better chess player than Mr. G”?
The empirical content of my belief is the testable, falsifiable prediction that the final chess position will occupy the class of chess positions that are wins for Kasparov, rather than drawn games or wins for Mr. G. (Counting resignation as a legal move that leads to a chess position classified as a loss.) The degree to which I think Kasparov is a “better player” is reflected in the amount of probability mass I concentrate into the “Kasparov wins” class of outcomes, versus the “drawn game” and “Mr. G wins” class of outcomes. These classes are extremely vague in the sense that they refer to vast spaces of possible chess positions—but “Kasparov wins” is more specific than maximum entropy, because it can be definitely falsified by a vast set of chess positions.
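To make "more specific than maximum entropy" concrete, here is a minimal sketch that coarse-grains the final position into the three outcome classes and compares a confident belief against a uniform one. The probabilities 0.90 / 0.08 / 0.02 are made-up numbers for illustration, not estimates of any real match, and the names `entropy_bits` and `surprise_bits` are hypothetical helpers. Treating the uniform distribution over three classes as the maximum-entropy baseline is a simplifying assumption; the essay's comparison is over the far larger space of chess positions.

```python
import math

def entropy_bits(dist):
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def surprise_bits(dist, outcome):
    """How many bits of surprise the belief suffers if `outcome` occurs."""
    return -math.log2(dist[outcome])

# Maximum-entropy baseline over the three coarse outcome classes (assumption).
uniform = {"kasparov_wins": 1 / 3, "draw": 1 / 3, "mr_g_wins": 1 / 3}

# Illustrative belief that Kasparov is the better player (made-up numbers).
belief = {"kasparov_wins": 0.90, "draw": 0.08, "mr_g_wins": 0.02}

print("entropy of uniform belief:", entropy_bits(uniform))
print("entropy of my belief:     ", entropy_bits(belief))
print("surprise if Mr. G wins:   ", surprise_bits(belief, "mr_g_wins"))
```

The concentrated belief has lower entropy than the uniform one, and it pays for that specificity: if Mr. G wins, it is penalized by far more bits of surprise than the uniform belief would be. That penalty is what makes the belief falsifiable rather than a masked form of ignorance.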
The outcome of Kasparov’s game is predictable because I know, and understand, Kasparov’s goals. Within the confines of the chess board, I know Kasparov’s motivations—I know his success criterion, his utility function, his target as an optimization process. I know where Kasparov is ultimately trying to steer the future and I anticipate he is powerful enough to get there, although I don’t anticipate much about how Kasparov is going to do it.
Imagine that I’m visiting a distant city, and a local friend volunteers to drive me to the airport. I don’t know the neighborhood. Each time my friend approaches a street intersection, I don’t know whether my friend will turn left, turn right, or continue straight ahead. I can’t predict my friend’s move even as we approach each individual intersection, let alone predict the whole sequence of moves in advance.
Yet I can predict the result of my friend’s unpredictable actions: we will arrive at the airport. Even if my friend’s house were located elsewhere in the city, so that my friend made a completely different sequence of turns, I would just as confidently predict our arrival at the airport. I can predict this long in advance, before I even get into the car. My flight departs soon, and there’s no time to waste; I wouldn’t get into the car in the first place, if I couldn’t confidently predict that the car would travel to the airport along an unpredictable pathway.
Isn’t this a remarkable situation to be in, from a scientific perspective? I can predict the outcome of a process, without being able to predict any of the intermediate steps of the process.
How is this even possible? Ordinarily one predicts by imagining the present and then running the visualization forward in time. If you want a precise model of the Solar System, one that takes into account planetary perturbations, you must start with a model of all major objects and run that model forward in time, step by step.
Sometimes simpler problems have a closed-form solution, where calculating the future at time T takes the same amount of work regardless of T. A coin rests on a table, and after each minute, the coin turns over. The coin starts out showing heads. What face will it show a hundred minutes later? Obviously you did not answer this question by visualizing a hundred intervening steps. You used a closed-form solution that worked to predict the outcome, and would also work to predict any of the intervening steps.
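The coin example can be sketched in a few lines. The function names `face_by_simulation` and `face_closed_form` are hypothetical, chosen only for this illustration. The first runs the process forward minute by minute; the second uses the parity of the elapsed minutes, so its cost does not grow with T.

```python
def flip(face: str) -> str:
    """Turn the coin over once."""
    return "tails" if face == "heads" else "heads"

def face_by_simulation(minutes: int, start: str = "heads") -> str:
    """Step-by-step prediction: one flip per elapsed minute."""
    face = start
    for _ in range(minutes):
        face = flip(face)
    return face

def face_closed_form(minutes: int, start: str = "heads") -> str:
    """Closed-form prediction: only the parity of `minutes` matters."""
    return start if minutes % 2 == 0 else flip(start)

# Both methods agree at the end and at every intervening step.
print(face_closed_form(100))
print(all(face_by_simulation(t) == face_closed_form(t) for t in range(101)))
```

Note that the closed form answers the question for minute 37 just as readily as for minute 100, which is exactly the property the airport prediction lacks.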
But when my friend drives me to the airport, I can predict the outcome successfully using a strange model that won’t work to predict any of the intermediate steps. My model doesn’t even require me to input the initial conditions—I don’t need to know where we start out in the city!
I do need to know something about my friend. I must know that my friend wants me to make my flight. I must credit that my friend is a good enough planner to successfully drive me to the airport (if he wants to). These are properties of my friend’s initial state—properties which let me predict the final destination, though not any intermediate turns.
I must also credit that my friend knows enough about the city to drive successfully. This may be regarded as a relation between my friend and the city; hence, a property of both. But an extremely abstract property, which does not require any specific knowledge about either the city, or about my friend’s knowledge about the city.
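A toy model may help show how an outcome can be predictable when no individual step is. This is only a sketch under strong assumptions: the city is an unobstructed grid, the hypothetical `drive` function stands in for my friend, and "knowing the city" is reduced to always picking some turn that brings us one block closer. Which turn is picked at each intersection is random, so I cannot predict it, yet the endpoint does not depend on the starting point or on the turns taken.

```python
import random

def drive(start, airport, rng):
    """Return the intersections visited on one trip from start to airport.

    At each intersection the driver picks at random among the moves
    that reduce the remaining distance (a stand-in for competent planning).
    """
    x, y = start
    ax, ay = airport
    path = [(x, y)]
    while (x, y) != (ax, ay):
        options = []
        if x != ax:  # a block east or west would bring us closer
            options.append((x + (1 if ax > x else -1), y))
        if y != ay:  # a block north or south would bring us closer
            options.append((x, y + (1 if ay > y else -1)))
        x, y = rng.choice(options)
        path.append((x, y))
    return path

rng = random.Random()   # unseeded: the turns differ from run to run
airport = (5, 3)        # hypothetical coordinates
for start in [(0, 0), (9, 9), (2, 7)]:
    route = drive(start, airport, rng)
    print(start, "->", route[-1], "after", len(route) - 1, "blocks")
```

My prediction "we will arrive at the airport" uses only the goal and the assumption of competence. It never consults `start` or the random choices, which is why it cannot say anything about the intermediate turns.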
This is one way of viewing the subject matter to which I’ve devoted my life: these remarkable situations which place us in such odd epistemic positions. And my work, in a sense, can be viewed as unraveling the exact form of that strange abstract knowledge we can possess; whereby, not knowing the actions, we can justifiably know the consequence.
“Intelligence” is too narrow a term to describe these remarkable situations in full generality. I would say rather “optimization process”. A similar situation accompanies the study of biological natural selection, for example; we can’t predict the exact form of the next organism observed.
But my own specialty is the kind of optimization process called “intelligence”; and even narrower, a particular kind of intelligence called “Friendly Artificial Intelligence”—of which, I hope, I will be able to obtain especially precise abstract knowledge.