Belief in Intelligence
Since I am so uncertain of Kasparov’s moves, what is the empirical content of my belief that “Kasparov is a highly intelligent chess player”? What real-world experience does my belief tell me to anticipate? Is it a cleverly masked form of total ignorance?
To sharpen the dilemma, suppose Kasparov plays against some mere chess grandmaster Mr. G, who’s not in the running for world champion. My own ability is far too low to distinguish between these levels of chess skill. When I try to guess Kasparov’s move, or Mr. G’s next move, all I can do is try to guess “the best chess move” using my own meager knowledge of chess. Then I would produce exactly the same prediction for Kasparov’s move or Mr. G’s move in any particular chess position. So what is the empirical content of my belief that “Kasparov is a better chess player than Mr. G”?
The empirical content of my belief is the testable, falsifiable prediction that the final chess position will occupy the class of chess positions that are wins for Kasparov, rather than drawn games or wins for Mr. G. (Counting resignation as a legal move that leads to a chess position classified as a loss.) The degree to which I think Kasparov is a “better player” is reflected in the amount of probability mass I concentrate into the “Kasparov wins” class of outcomes, versus the “drawn game” and “Mr. G wins” class of outcomes. These classes are extremely vague in the sense that they refer to vast spaces of possible chess positions—but “Kasparov wins” is more specific than maximum entropy, because it can be definitely falsified by a vast set of chess positions.
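The shape of this belief can be made concrete in a short sketch (the particular numbers are illustrative assumptions, not anything measured): probability mass is assigned to broad outcome classes, not to individual moves.

```python
# Illustrative sketch: the belief "Kasparov is a better player than Mr. G"
# as probability mass over outcome classes. The numbers are assumptions
# chosen for illustration, not measurements.
belief = {
    "Kasparov wins": 0.85,
    "draw": 0.10,
    "Mr. G wins": 0.05,
}

# Total ignorance over the same classes: maximum entropy.
max_entropy = {outcome: 1 / 3 for outcome in belief}

# The belief concentrates mass, so it is falsifiable: a final position
# in the "Mr. G wins" class would count heavily against it.
assert abs(sum(belief.values()) - 1.0) < 1e-9
assert belief["Kasparov wins"] > max_entropy["Kasparov wins"]
```

The belief says nothing about any particular move, yet it is more specific than maximum entropy over the three classes.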
The outcome of Kasparov’s game is predictable because I know, and understand, Kasparov’s goals. Within the confines of the chess board, I know Kasparov’s motivations—I know his success criterion, his utility function, his target as an optimization process. I know where Kasparov is ultimately trying to steer the future and I anticipate he is powerful enough to get there, although I don’t anticipate much about how Kasparov is going to do it.
Imagine that I’m visiting a distant city, and a local friend volunteers to drive me to the airport. I don’t know the neighborhood. Each time my friend approaches a street intersection, I don’t know whether my friend will turn left, turn right, or continue straight ahead. I can’t predict my friend’s move even as we approach each individual intersection—let alone, predict the whole sequence of moves in advance.
Yet I can predict the result of my friend’s unpredictable actions: we will arrive at the airport. Even if my friend’s house were located elsewhere in the city, so that my friend made a completely different sequence of turns, I would just as confidently predict our arrival at the airport. I can predict this long in advance, before I even get into the car. My flight departs soon, and there’s no time to waste; I wouldn’t get into the car in the first place, if I couldn’t confidently predict that the car would travel to the airport along an unpredictable pathway.
Isn’t this a remarkable situation to be in, from a scientific perspective? I can predict the outcome of a process, without being able to predict any of the intermediate steps of the process.
How is this even possible? Ordinarily one predicts by imagining the present and then running the visualization forward in time. If you want a precise model of the Solar System, one that takes into account planetary perturbations, you must start with a model of all major objects and run that model forward in time, step by step.
Sometimes simpler problems have a closed-form solution, where calculating the future at time T takes the same amount of work regardless of T. A coin rests on a table, and after each minute, the coin turns over. The coin starts out showing heads. What face will it show a hundred minutes later? Obviously you did not answer this question by visualizing a hundred intervening steps. You used a closed-form solution that worked to predict the outcome, and would also work to predict any of the intervening steps.
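The coin example can be written out directly; a closed-form rule answers in constant time, while stepwise simulation does the same work a hundred times over (a minimal sketch):

```python
def coin_after(minutes: int) -> str:
    # Closed form: only the parity of the elapsed minutes matters,
    # so the cost of prediction is the same for any horizon T.
    return "heads" if minutes % 2 == 0 else "tails"

def coin_after_stepwise(minutes: int) -> str:
    # Step-by-step visualization: same answer, but the work grows with T.
    face = "heads"
    for _ in range(minutes):
        face = "tails" if face == "heads" else "heads"
    return face

# Both agree on the outcome, and the closed form also answers
# for any intermediate step at the same constant cost.
assert coin_after(100) == coin_after_stepwise(100) == "heads"
```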
But when my friend drives me to the airport, I can predict the outcome successfully using a strange model that won’t work to predict any of the intermediate steps. My model doesn’t even require me to input the initial conditions—I don’t need to know where we start out in the city!
I do need to know something about my friend. I must know that my friend wants me to make my flight. I must credit that my friend is a good enough planner to successfully drive me to the airport (if he wants to). These are properties of my friend’s initial state—properties which let me predict the final destination, though not any intermediate turns.
I must also credit that my friend knows enough about the city to drive successfully. This may be regarded as a relation between my friend and the city; hence, a property of both. But an extremely abstract property, which does not require any specific knowledge about either the city, or about my friend’s knowledge about the city.
This is one way of viewing the subject matter to which I’ve devoted my life—these remarkable situations which place us in such an odd epistemic position. And my work, in a sense, can be viewed as unraveling the exact form of that strange abstract knowledge we can possess; whereby, not knowing the actions, we can justifiably know the consequence.
“Intelligence” is too narrow a term to describe these remarkable situations in full generality. I would say rather “optimization process”. A similar situation accompanies the study of biological natural selection, for example; we can’t predict the exact form of the next organism observed.
But my own specialty is the kind of optimization process called “intelligence”; and even narrower, a particular kind of intelligence called “Friendly Artificial Intelligence”—of which, I hope, I will be able to obtain especially precise abstract knowledge.
I don’t know that it’s that impressive. If we launch a pinball in a pinball machine, we may have a devil of a time calculating its path off all the bumpers, but we know that the pinball is going to wind up falling into the hole in the middle. Is gravity really such a genius?
This is confused about who/what the agent is and about assumed goals.
The final question suggests that the agent is gravity. Nobody thinks that the goal/value function of gravity is to make the pinball fall in the hole. To a first approximation, its goal is to have ALL objects fall to earth, and we observe it thwarted in that goal almost all the time; the pinball happens to be a rare success.
If we were to suggest that the pinball machine were the agent, that might make more sense, but then we would say that the pinball machine does not make any decisions and so cannot be an agent.
The first level at which agency makes any sense is when considering the agency of the pinball designer. The goal of the designer is to produce a game that attracts players and has a playtime within a preferred range, even for skilled players. The designer is intelligent.
It seems to me that you are predicting the path of the pinball, but quickly enough that you don’t realize you’re doing it. It’s such a fundamental axiom that if there is a clear downward path to a given position, this position will be reached, that it’s easy to forget that it was originally reasoning about intermediate steps that led to this axiom. At most points the pinball can reach, it is expected to move down. At the next point, it’s expected to move down again. You would inductively expect it to reach a point where it cannot move down anymore, and this point is the hole (or sometimes a fault in the machine).
Contrast with the hole being upraised, or blocked by some barrier. All of the paths you envision lead to a point other than the hole, so you conclude that the ball will land instead on some other array of points. There it’s easier to see that gravity still requires path-based reasoning.
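The inductive argument above can be made concrete with a toy model (my own illustration, assuming a pachinko-style board): each bounce is unpredictable, but every bounce moves the ball one level down, so the bottom is always reached regardless of the lateral path.

```python
import random

def drop_ball(levels: int, seed: int) -> int:
    # Toy pachinko-style board: at each level the ball bounces left or
    # right unpredictably (the intermediate steps we cannot predict),
    # but every bounce moves it exactly one level down.
    rng = random.Random(seed)
    slot = 0
    for _ in range(levels):
        slot += rng.choice([-1, 1])
    return slot  # lateral slot at the bottom; the bottom is always reached

# Different seeds give different paths and different final slots,
# yet every run terminates at the bottom level after `levels` bounces.
final_slots = [drop_ball(20, seed) for seed in range(10)]
assert all(-20 <= s <= 20 for s in final_slots)
```

The prediction "the ball reaches the bottom" holds for every path, which is exactly what the closed-form-vs-path-based distinction is about.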
No, gravity isn’t genius, because even a well-designed pinball machine can have a fault that lets the ball get stuck somewhere. You are using a closed-form solution, not anticipating an optimization process.
Gravity doesn’t let water flow uphill a little in order to flow downhill a lot later.
You might not be aware of my reaction times or how good I am at pinball, but you may anticipate with high probability that if I play this pinball game, the pinball will not just wind up falling straight in the hole.
Not to detract from your main point, but superfluids do exactly that.
Superfluids still do not jump to higher energy states in order to descend to a lower one afterwards. Each atomic interaction a superfluid exhibits will always conserve energy and, on average, increase entropy.
Each move in an optimization process will increase expected utility...
Yes. The only variables are the Utility Function, the Search Space Compression and the Primitive Action Set.
Reminds how real “abstract” concepts are, when they get translated into simple physical events. The concept of winning, though intermediary of abstract inference, gets translated into individual physical moves, which bring the environment into a winning state. The concept of a world champion captures the whole process, and translates into an expectation of winning, maybe into an action of betting. And if you can’t accurately translate these concepts into individual moves, you can’t control the winning moves.
“Is gravity really such a genius?”
Gravity may not be a genius, but it’s still an optimization problem, since the ball “wants” to minimize its potential energy. Of course, there are limits to such a reasoning: perhaps the ball will get stuck somewhere and never reach the lowest-energy state.
Nature continually runs its own optimisation process. It maximises entropy. That’s why water runs downhill, why gas expands into a vacuum, and why pinballs fall down their holes. For details, see my Bright Light essay, and the work of Roderick Dewar.
I don’t think increasing entropy can be considered an optimization process. It’s not moving towards a narrow goal set; it’s moving towards a wide one. What’s more, the low-entropy results are not disproportionately unlikely. The increase in entropy will not make a glass of water any less likely to spontaneously freeze than to exist in any other configuration. You’re basically saying that you’re 99.9999% sure that the outcome will be in a class that contains 99.9999% of all states. You know nothing.
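One way to cash out "you know nothing" is in bits of discrimination: how far a prediction narrows the future relative to chance. A hedged sketch (the 85% and one-in-a-million figures are illustrative assumptions):

```python
import math

def bits_of_discrimination(p_belief: float, p_chance: float) -> float:
    # Log-odds of the predicted class under the belief versus under
    # pure chance: a rough measure of how much the prediction narrows
    # the space of outcomes.
    return math.log2(p_belief / p_chance)

# 99.9999% confidence in a class holding 99.9999% of all states:
# approximately zero bits.
weak = bits_of_discrimination(0.999999, 0.999999)

# 85% confidence in a class holding (say) one in a million states:
# a substantial, falsifiable prediction.
strong = bits_of_discrimination(0.85, 1e-6)

assert abs(weak) < 1e-9
assert strong > 19
```

By this measure, predicting "entropy will increase" earns essentially nothing, while predicting "Kasparov wins" earns many bits.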
I wouldn’t count gravity (or the pinball machine) as an optimization process, and I think Elizier would say the same, only better. Once you start saying that gravity, gas expansion etc. are optimization process, it kinda sounds like everything is an optimization process, which isn’t very useful.
The pinball machine question is still useful because it helps refine the concept. I’d say the difference between the pinball machine and Kasparov is that once you know the probability distributions for every individual “choice” of the machine (whether a ball will bounce left or right, etc.), you know all about the machine and can use those distributions to prove the ball will reach the bottom. (If it’s a bit hard to imagine for a pinball machine, imagine a simplified model with a bunch of tubes going around, and some “random” nodes where the ball may fall left or right—like a pinball or patchinko machine, but with a countable number of paths.)
Unlike with the pinball machine, having a probability distribution for each of Kasparov’s choices isn’t enough to predict the result of the Kasparov-Elizier game.
You need to know that Kasparov is trying to win to predict the result.
There’s no equivalent knowledge for the pinball machine—having a model of each of its possible “choices” is enough.
(damn, it’s Pachinko, not patchinko)
You still need to know about gravity to make local predictions about the path of the ball in the pinball machine, but unlike for Kasparov, you don’t need to think at all about future or final states.
(Damn, it’s Eliezer, not Elizier. Sorry!)
A similar phenomenon arises in trying to bound the error of a numerical computation by running it using interval arithmetic. The result is conservative, but sometimes useful. However, once in a while one applies it to a slowly converging iterative process that produces an accurate answer. Lots of arithmetic leads to large intervals even though the error to be bounded is small.
We also have concepts such as “wealth” and “power” which describe an agent’s ability to achieve its goals. Will you be distinguishing “intelligence” from these, or are they synonyms for your purposes?
Wealth and intelligence are both kinds of power. The most important distinction may be that we think of wealth as outside the agent, whereas intelligence is internal to the agent. So if you change the agent’s environment while leaving the agent intact, its intelligence will remain constant, but its wealth will fluctuate.
If it’s hard to exactly define where an agent’s boundaries are, the distinction might become merely conventional. For instance, if my normal level of intelligence is reliant on ingesting a certain food or nootropic (or meme), then that resource may be viewed as part of my intelligence (like my brain’s architecture), or it may be viewed as an environmental factor that modulates how effective my intelligence is at shaping the particular environment it’s in.
“Wealth” refers to possession of valuable things specifically. “Power” seems to be more the potential to optimize rather than the act of actually optimizing.
Pinball machines are optimization processes to move quarters in through a little slot and out through a little locked door.
@Nominull: Not impressive? The genius is in making a pinball machine that works. Even in the simplest case (a ‘plinko’ game), the board, Kasparov, and the driver are all finely tuned results of some other optimization process.
Using the terms as Eliezer has, can you offer an example of a phenomenon that is NOT an optimization?
Giving an example of a phenomenon that is not an optimization is like giving an example of something without a weight. It’s a sliding scale. Everything optimizes, but some things optimize better than others. A pebble doesn’t optimize much of anything. An animal can optimize its inclusive genetic fitness, but not that well. Humans are better at optimizing, but don’t always optimize the same thing, and often work at cross-purposes against themselves.
Could you explain your analogy? In our universe, some things don’t have mass; a sliding scale can have a 0 point, and it might be that in a certain universe almost everything falls on that 0. If your optimization power is 0 on the scale, in a sense you’re an atrocious optimization process; but I think it’s a bit clearer to say that you aren’t an optimization process at all.
I guess you can feasibly get zero with some things, but this is more like hitting distance zero from a bullseye. If you score 0.0001, you’re an atrocious optimizer. If you’re a little worse, you score −0.0001, meaning you’re actually optimizing for the opposite effect, for which you score 0.0001. If you pick an entity and a goal system at random, you probably won’t get a very high score for optimization, and it will be negative half the time, but it will almost never be zero. In order for an entity to not optimize any goal system, it would have to score a perfect zero for every goal system. It’s not going to happen, unless you count “nothing” as your entity.
Wait a minute. Not everything in our universe is real-valued, much less continuous. Unless you’re saying that an optimization goal must produce a well-ordering of possible environment states (which isn’t true for any definition of optimization I’ve ever heard of in an AI context), it should be fairly easy to come up with an objective that generates a cost function returning zero for many possible hypotheses.
For example, “optimize the number of electoral votes I get in the next US presidential election”.
You mean an ordering? The reals aren’t well-ordered.
If there’s no ordering, there’s circular preferences.
In any case, that’s not what I was talking about.
Compare the expected number of electoral votes with and without the optimizer. The difference gives you how powerful the optimizer is, and it will almost never be zero.
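The comparison described here can be sketched as a function (illustrative only; the score function and outcome samples are assumptions, not any standard measure):

```python
def optimization_power(score, with_optimizer, without_optimizer):
    # Expected score over sampled outcomes with the optimizer present,
    # minus the expectation with it absent. A positive value means the
    # process steers outcomes toward the goal; for a randomly chosen
    # entity and goal it will almost never be exactly zero.
    expected_with = sum(score(o) for o in with_optimizer) / len(with_optimizer)
    expected_without = sum(score(o) for o in without_optimizer) / len(without_optimizer)
    return expected_with - expected_without

# Toy example: electoral vote counts under sampled scenarios.
votes = lambda outcome: outcome  # each outcome is already a vote count
power = optimization_power(
    votes,
    with_optimizer=[270, 300, 310],   # campaigns actively run
    without_optimizer=[0, 3, 0],      # no campaign at all
)
assert power > 0
```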
That’s equivalent to asserting that the axiom of choice is untrue. That’s not derivable from the other axioms, and in fact the axiom of choice is often used in mathematics. (This is entirely irrelevant to the [long-dead] discussion, however.)
Shoot, you’re right. I believe I meant a strict ordering; it’s been a while since I last studied set theory.
I’m confused as to what you mean by an optimizer now, though. It sounds like you mean something along the lines of a utility-based agent, but expected utility in this context is an attribute of a hypothesis relative to a model, not of the hypothesis relative to the world, and we’re just as free to define models as we are to define optimization objectives. Previously I’d been thinking in terms of a more general agent, which needn’t use a concept of utility and whose performance relative to an objective is found in retrospect.
It doesn’t need to use utility explicitly. It’s just whatever objective it tends to gravitate towards.
I’m not entirely sure what you’re saying in the rest of the comment.
The reason I’m talking about “expected value” is that an optimizer must be able to work in a variety of environments. This is equivalent to talking about a probability distribution of environments.
I mean a well-ordering, though I’ll admit that was a bit unclear in context. Possible environment states are a set, not points on the real line.
I feel like I’ve read this exact post before. Deja Vu?
Entropy-generating processes are the ones that perform optimisation—specifically, processes where there is a range of different outcomes with different entropies. Gravity barely qualifies—but friction certainly does—and it is not gravity but friction that keeps the pinball down.
You need to learn control theory. There is nothing strange about the situation you describe: this is a characteristic of all control systems, and hence of pretty much any prediction you try to make about living organisms (or AIs).
If I set the room thermostat to 20°C, I can predict what the temperature will be in the room for the indefinite future. I will not be able to predict when it will turn the heating on and off (or in hotter places, the air conditioning), because that will depend on the weather outside, which I cannot predict, and the number of people or other power sources in the room, about which I may know nothing. I can do no better than a very mushy prediction that the heating will be turned on for a greater proportion of the night than the day, and less during a LAN party. Not knowing the actions, we can justifiably know the consequence. This is an entirely unmysterious fact about control systems—it is what they do.
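A minimal thermostat sketch (my own illustration, with made-up heating and leakage constants) shows the same structure: the on/off actions track the unpredictable weather, while the room temperature is predictable in advance:

```python
def simulate_room(setpoint, outside_temps, start_temp=15.0):
    # Bang-bang thermostat: the heater switches on whenever the room
    # is below the setpoint. The action sequence depends on the outside
    # temperature; the final room temperature barely does.
    temp = start_temp
    actions = []
    for outside in outside_temps:
        heating = temp < setpoint
        actions.append(heating)
        temp += 2.0 if heating else 0.0  # heater input (made-up constant)
        temp += 0.05 * (outside - temp)  # leakage to outside (made-up constant)
    return temp, actions

# Two very different weather histories...
final_mild, actions_mild = simulate_room(20.0, [5.0] * 200)
final_cold, actions_cold = simulate_room(20.0, [-10.0] * 200)

# ...yield different action sequences but nearly the same room temperature.
assert actions_mild != actions_cold
assert abs(final_mild - 20.0) < 3 and abs(final_cold - 20.0) < 3
```

Not knowing the actions, we can justifiably know the consequence: the setpoint is a property of the controller, not of the weather.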
It is also isomorphic to the situations of predicting that Kasparov will beat Mr. G and that the driver will arrive at the airport. People act so as to achieve goals. The actions will depend on the ongoing circumstances: each action will be whatever is effective at that moment in reaching the goal. In all three examples, the circumstances cannot be predicted, therefore the actions cannot be predicted. Not knowing the actions, we can justifiably know the consequence.
We can justifiably know the consequence because the result is not merely a consequence of the actions: the actions were chosen to produce the result. That is why we need not know of the actions. We only need know that there is a system in place that is able to choose such actions. Even when it is a black box, we can tell that there is a control system inside by observing the very fact that the result is consistent while the actions vary.
For the thermostat, we know exactly how it chooses its actions. For the driver’s task, we almost know enough to duplicate the feat, if not the actual mechanism in the driver. For Kasparov, we know almost nothing about how he wins at chess. But this is merely ignorance, not mysteriousness. His track record demonstrates that he can defeat all ordinary masters of the game. Knowing that he can, we can predict that he will.
On the contrary, imagining the present and running the visualisation forwards is a very bad method of making predictions. You can only do it successfully for very simple systems, such as the planets orbiting the sun, and it doesn’t work very well even for that (replacing “imagining” by “measuring” and “running the visualization forward in time” by “numerically solving Newton’s laws by iterating through time”). It does not work at all for any control system, because its actions of the moment depend not only on the goal, but also on the current situation.
For example:
Last Friday there was announced a meet-up of OB people for Saturday. Several announced on OB that they would be there. I predict that the meeting took place, and that most of those who said they would be there were. If the meeting did happen, presumably everyone who went made a similar prediction, and set out expecting to meet the others.
Did anyone who made such a prediction do so by imagining the present and working forwards? I cannot see how such a thing could be done. You might not know—I certainly do not—where each of them would have been beforehand, what means of transportation they had available, and so on. I have no information to base such an imagining on. Those who went might be more well-informed about each other, but not to the extent of carrying out the computation. What I do know is the goals that were expressed, and on that basis alone I can predict that most of those goals were accomplished. I predict the result, while predicting none of the actions that caused the result, and imagining no trajectory linking the RSVPs to Saturday evening.
Richard, You’re making the exact point Eliezer just did, about how modeling the effects of intelligence doesn’t generally proceed by running a simulation forward. The “ordinarily” he speaks of, I assume, refers to the vast majority of physical systems in the Universe, in which there are no complicated optimization processes (especially intelligences) affecting outcomes on the relevant scales.
Patrick(orthonormal): The “ordinarily” he speaks of, I assume, refers to the vast majority of physical systems in the Universe, in which there are no complicated optimization processes (especially intelligences) affecting outcomes on the relevant scales.
My point is that modelling the effects of unintelligence doesn’t generally proceed by running a simulation forward either. No intelligence and no optimisation processes, complicated or otherwise, need be present for the system to be unpredictable by this method. The room thermostat is not intelligent. My robot is not intelligent. Neither do they optimise anything. Here is Eliezer’s own example of an “ordinary” system: “If you want a precise model of the Solar System, one that takes into account planetary perturbations, you must start with a model of all major objects and run that model forward in time, step by step.”
But this is, in fact, not how astronomers precisely predict the future positions of the bodies of the Solar System. They do not “run a model forward in time, step by step”. Instead, from observations they compute a set of parameters (“orbital elements”) of the closed-form solution to the two-body problem (the second body being the Sun), then add perturbative adjustments to account for interactions between bodies. This is a more accurate method, at least when the bodies do not perturb each other too much. It also allows future positions to be predicted without computing any of the intermediate positions—without, that is, running a model forwards in time.
So not only does forward simulation not work at all for intelligent systems, neither does it work at all for unintelligent control systems, and it does not even work very well for a bunch of dumb rocks.
Both forward simulation and the method of elements and perturbations are mathematically derived from Newton’s laws. How is it that the latter method can predict where an asteroid will be at a point in the future, not only without computing its entire trajectory, but more accurately than if we did? It is because Newton’s laws tell us more than the moment-to-moment evolution. They mathematically imply long-range properties of that evolution that allow these predictions to be made.
Oh the comments here are a sad state of affairs. :(
And this is an excessively important article too.
A lot of the things that ancient cultures attributed to God are this kind of thinking.
If you see a dead pig on the side of the road with no signs of violence, stay the heck away from it. You don’t have to know which specific disease it died of, or even what a disease is. People have just noticed that anyone who goes near such a thing tends to die horribly later and maybe takes half the tribe with them. The precise intermediate steps are largely irrelevant, just the statistical correlation.
There are two failure modes to watch out for.
The first is when people start worshiping their own ignorance and refuse to update the rules as their understanding of the underlying principles improves.
The second is when people recognize that the idea of “God” as an old man with a long beard who lives in the clouds is patently ridiculous and assume therefore that all of the principles and rules intended to “stay his wrath” may be ignored with utter impunity.
To the first type I generally point out that whatever creator they believe exists gave us our intelligence as well, and refusing to use that gift to the utmost would be an insult.
To the second I like to suggest that, since “Thor” is imaginary, maybe they should go stand in an open field and wave a metal stick around during the next thunderstorm… A “primitive” understanding of something is not the same as being stupid, and a few thousand years of experience that says, “If you do X, bad things happen,” should not be ignored lightly.