Aiming at the Target
Previously in series: Belief in Intelligence
Previously, I spoke of that very strange epistemic position one can occupy, wherein you don’t know exactly where Kasparov will move on the chessboard, and yet your state of knowledge about the game is very different than if you faced a random move-generator with the same subjective probability distribution—in particular, you expect Kasparov to win. I have beliefs about where Kasparov wants to steer the future, and beliefs about his power to do so.
Well, and how do I describe this knowledge, exactly?
In the case of chess, there’s a simple function that classifies chess positions into wins for black, wins for white, and drawn games. If I know which side Kasparov is playing, I know the class of chess positions Kasparov is aiming for. (If I don’t know which side Kasparov is playing, I can’t predict whether black or white will win—which is not the same as confidently predicting a drawn game.)
More generally, I can describe motivations using a preference ordering. When I consider two potential outcomes, X and Y, I can say that I prefer X to Y; prefer Y to X; or find myself indifferent between them. I would write these relations as X > Y; X < Y; and X ~ Y.
Suppose that you have the ordering A < B ~ C < D ~ E. Then you like B more than A, and C more than A. {B, C}, belonging to the same class, seem equally desirable to you; you are indifferent between which of {B, C} you receive, though you would rather have either than A, and you would rather have something from the class {D, E} than {B, C}.
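An illustrative sketch (hypothetical encoding, not from the post): one way to represent such an ordering in code is to give each outcome an integer rank, where outcomes sharing a rank form an indifference class and a higher rank means more preferred.

```python
# Toy encoding of the preference ordering A < B ~ C < D ~ E.
# Equal ranks are indifference classes; a higher rank is more preferred.
rank = {"A": 0, "B": 1, "C": 1, "D": 2, "E": 2}

def prefers(x: str, y: str) -> str:
    """Return '>', '<', or '~' for outcomes x and y under the ordering above."""
    if rank[x] > rank[y]:
        return ">"
    if rank[x] < rank[y]:
        return "<"
    return "~"

print(prefers("B", "A"))  # '>' : B is preferred to A
print(prefers("B", "C"))  # '~' : indifferent between B and C
print(prefers("C", "E"))  # '<' : anything in {D, E} beats anything in {B, C}
```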
When I think you’re a powerful intelligence, and I think I know something about your preferences, then I’ll predict that you’ll steer reality into regions that are higher in your preference ordering.
Think of a huge circle containing all possible outcomes, such that outcomes higher in your preference ordering appear to be closer to the center. Outcomes between which you are indifferent are the same distance from the center—imagine concentric rings of outcomes that are all equally preferred. If you aim your actions and strike a consequence close to the center—an outcome that ranks high in your preference ordering—then I’ll think better of your ability to aim.
The more intelligent I believe you are, the more probability I’ll concentrate into outcomes that I believe are higher in your preference ordering—that is, the more I’ll expect you to achieve a good outcome, and the better I’ll expect the outcome to be. Even if a powerful enemy opposes you, so that I expect the final outcome to be one that is low in your preference ordering, I’ll still expect you to lose less badly if I think you’re more intelligent.
What about expected utilities as opposed to preference orderings? To talk about these, you have to attribute a probability distribution to the actor, or to the environment—you can’t just observe the outcome. If you have one of these probability distributions, then your knowledge of a utility function can let you guess at preferences between gambles (stochastic outcomes) and not just preferences between the outcomes themselves.
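An illustrative sketch (the utility numbers are hypothetical, chosen only to be consistent with the ordering A < B ~ C < D ~ E): the bare ordering cannot say whether an agent takes a sure B or a 50/50 gamble between A and D, but a utility function combined with a probability distribution can.

```python
# Hypothetical utilities consistent with A < B ~ C < D ~ E.
utility = {"A": 0.0, "B": 1.0, "C": 1.0, "D": 3.0, "E": 3.0}

def expected_utility(gamble):
    """gamble maps outcomes to probabilities that sum to 1."""
    return sum(p * utility[o] for o, p in gamble.items())

sure_b = {"B": 1.0}
coin_flip = {"A": 0.5, "D": 0.5}
print(expected_utility(sure_b))     # 1.0
print(expected_utility(coin_flip))  # 1.5 : this agent prefers the gamble to a sure B
```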
The “aiming at the target” metaphor—and the notion of measuring how closely we hit—extends beyond just terminal outcomes, to the forms of instrumental devices and instrumental plans.
Consider a car—say, a Toyota Corolla. The Toyota Corolla is made up of some number of atoms—say, on the (very) rough order of ten to the twenty-ninth. If you consider all the possible ways we could arrange those 10^29 atoms, it’s clear that only an infinitesimally tiny fraction of possible configurations would qualify as a working car. If you picked a random configuration of 10^29 atoms once per Planck time, many ages of the universe would pass before you hit on a wheeled wagon, let alone an internal combustion engine.
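A rough back-of-the-envelope on “many ages of the universe”, using standard approximate values for the Planck time and the age of the universe (illustrative arithmetic, not from the post):

```python
import math

planck_time_s = 5.39e-44      # seconds, approximate
age_of_universe_s = 4.35e17   # ~13.8 billion years in seconds

samples_per_age = age_of_universe_s / planck_time_s
print(f"samples per universe-age: {samples_per_age:.1e}")   # ~8.1e60

# For scale: the orderings of just 100 distinguishable atoms already
# outnumber those samples by almost a hundred orders of magnitude.
print(f"100! ~ {float(math.factorial(100)):.1e}")            # ~9.3e157
```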
(When I talk about this in front of a popular audience, someone usually asks: “But isn’t this what the creationists argue? That if you took a bunch of atoms and put them in a box and shook them up, it would be astonishingly improbable for a fully functioning rabbit to fall out?” But the logical flaw in the creationists’ argument is not that randomly reconfiguring molecules would by pure chance assemble a rabbit. The logical flaw is that there is a process, natural selection, which, through the non-chance retention of chance mutations, selectively accumulates complexity, until a few billion years later it produces a rabbit. Only the very first replicator in the history of time needed to pop out of the random shaking of molecules—perhaps a short RNA string, though there are more sophisticated hypotheses about autocatalytic hypercycles of chemistry.)
Even restricting our attention to running vehicles, there is an astronomically huge design space of possible vehicles that could be composed of the same atoms as the Corolla, and most of them, from the perspective of a human user, won’t work quite as well. We could take the parts in the Corolla’s air conditioner, and mix them up in thousands of possible configurations; nearly all these configurations would result in a vehicle lower in our preference ordering, still recognizable as a car but lacking a working air conditioner.
So there are many more configurations corresponding to nonvehicles, or vehicles lower in our preference ranking, than vehicles ranked greater than or equal to the Corolla.
A tiny fraction of the design space does describe vehicles that we would recognize as faster, more efficient, and safer than the Corolla. Thus the Corolla is not optimal under our preferences, nor under the designer’s own goals. The Corolla is, however, optimized, because the designer had to hit an infinitesimal target in design space just to create a working car, let alone a car of Corolla-equivalent quality. The subspace of working vehicles is dwarfed by the space of all possible molecular configurations for the same atoms. You cannot build so much as an effective wagon by sawing boards into random shapes and nailing them together according to coinflips. To hit such a tiny target in configuration space requires a powerful optimization process. The better the car you want, the more optimization pressure you have to exert—though you need a huge optimization pressure just to get a car at all.
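One hedged way to put a number on “optimization pressure” (an illustrative gloss, not a definition the post gives): hitting a target region that occupies a fraction f of the configuration space requires about log2(1/f) bits of selection, so smaller targets demand exponentially more pressure.

```python
import math

def bits_of_optimization(target_fraction: float) -> float:
    """Bits needed to single out a target occupying this fraction of the space."""
    return math.log2(1.0 / target_fraction)

print(bits_of_optimization(1e-6))    # ~19.9 bits : one outcome in a million
print(bits_of_optimization(1e-100))  # ~332.2 bits : a one-in-10^100 sliver of design space
```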
This whole discussion assumes implicitly that the designer of the Corolla was trying to produce a “vehicle”, a means of travel. This assumption deserves to be made explicit, but it is not wrong, and it is highly useful in understanding the Corolla.
Planning also involves hitting tiny targets in a huge search space. On a 19-by-19 Go board there are roughly 2 × 10^170 legal positions (not counting superkos). In the early positions of a Go game there are more than 300 legal moves per turn. The search space explodes, and nearly all moves are foolish ones if your goal is to win the game. From all the vast space of Go possibilities, a Go player seeks out the infinitesimal fraction of plans which have a decent chance of winning.
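For a rough sense of scale (illustrative arithmetic; the legal-position count is John Tromp’s, also quoted in the comments below):

```python
import math

board_points = 19 * 19                    # 361 intersections
naive_states = 3 ** board_points          # each point empty, black, or white
legal_positions = 2.081681994e170         # Tromp's count of legal positions

print(f"naive states ~ 1e{math.log10(naive_states):.0f}")        # ~1e172
print(f"legal fraction: {legal_positions / naive_states:.1%}")   # ~1.2% of naive states
# The game tree is vastly larger still: roughly 250 moves per turn over ~150 turns.
print(f"game sequences ~ 1e{150 * math.log10(250):.0f}")         # ~1e360
```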
You cannot even drive to the supermarket without planning—it will take you a long, long time to arrive if you make random turns at each intersection. The set of turn sequences that will take you to the supermarket is a tiny subset of the space of turn sequences. Note that the subset of turn sequences we’re seeking is defined by its consequence—the target—the destination. Within that subset, we care about other things, like the driving distance. (There are plans that would take us to the supermarket in a huge pointless loop-the-loop.)
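A toy simulation (entirely illustrative, not from the post) of why random turns take so long: a driver who moves one block in a uniformly random direction at each intersection needs vastly more blocks to stumble onto the destination than the planned route requires.

```python
import random

def random_walk_blocks(target=(5, 5), max_blocks=1_000_000, seed=0):
    """Blocks driven before a driver making uniformly random moves first hits the target."""
    rng = random.Random(seed)
    x, y = 0, 0
    for blocks in range(1, max_blocks + 1):
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x, y = x + dx, y + dy
        if (x, y) == target:
            return blocks
    return None  # never arrived within the budget

planned = 5 + 5  # a planned Manhattan route: ten blocks
print(f"planned route: {planned} blocks; random turns: {random_walk_blocks()} blocks")
```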
In general, as you live your life, you try to steer reality into a particular region of possible futures. When you buy a Corolla, you do it because you want to drive to the supermarket. You drive to the supermarket to buy food, which is a step in a larger strategy to avoid starving. All else being equal, you prefer possible futures in which you are alive, rather than dead of starvation.
When you drive to the supermarket, you aren’t really aiming for the supermarket, you’re aiming for a region of possible futures in which you don’t starve. Each turn at each intersection doesn’t carry you toward the supermarket, it carries you out of the region of possible futures where you lie helplessly starving in your apartment. If you knew the supermarket was empty, you wouldn’t bother driving there. An empty supermarket would occupy exactly the same place on your map of the city, but it wouldn’t occupy the same role in your map of possible futures. It is not a location within the city that you are really aiming at, when you drive.
Human intelligence is one kind of powerful optimization process, capable of winning a game of Go or turning sand into digital computers. Natural selection is much slower than human intelligence; but over geological time, cumulative selection pressure qualifies as a powerful optimization process.
Once upon a time, human beings anthropomorphized stars, saw constellations in the sky and battles between constellations. But though stars burn longer and brighter than any craft of biology or human artifice, stars are neither optimization processes, nor products of strong optimization pressures. The stars are not gods; there is no true power in them.
I agree, generally, but I’ve got a quibble with your last paragraph. I’m tentatively in agreement that stars are not, qua stars, optimization processes, but I’m less certain that stars do not contain optimization processes. And I’m tentatively certain that stars are the products of strong optimization pressures; how likely is star formation? Doesn’t cosmological/astronomical evolution (i.e. the ‘rules’ by which it occurs) count as a (powerful) form of selection? There are innumerable dust clouds in the visible universe that never became stars.
Have you read ‘A New Kind of Science’ (yes, it’s a pretentious title) by Stephen Wolfram? He has a number of interesting discussions of intelligence. Your recent posts re: intelligence and optimization processes remind me of Wolfram’s statement (I’m greatly paraphrasing) that a sufficiently general definition of intelligence (in terms of information processing and something akin to your optimization processes) would necessarily include all kinds of entities that we would not categorize as intentional.
Stars are a strictly accidental, natural formation given gravity and non-homogenous, sufficiently large gas clouds. That’s that.
Nuclear computing is a whole different matter. If you can overcome the thermodynamic activity at upwards of 15 billion kelvin, then it is definitely a possibility. Personally I would assign higher probability to concentric dyson-sphere brains.
I’m curious what you think about this after reading these 17 pages from the book I mentioned. The point I was making relied on the idea (or perspective) that everything already is computing. The point then being that the physical processes within stars might be sufficiently advanced computationally that they could be considered, in some sense, intelligent.
Intelligence is optimization power. Boltzmann brains don’t have a lot.
Tromp gives 2.081681994 * 10^170.
How do you avoid conflating intelligence with power? (Or do you, in fact, think that the two are best regarded as different facets of the same thing?) I’d have more ability to steer reality into regions I like if I were cleverer—but also if I were dramatically richer or better-connected.
Re intelligence vs. power: think about it as quality of engine vs. how you apply the engine. You can have a perfect car and still drown it in a pool. Quality of engine is abstract knowledge about it; it’s not derived from the performance of that engine on one specific race track. If you are highly intelligent but get hit by a stone falling from the sky, it doesn’t mean that this was your preferred outcome. If the outcome turns out wrong every time, it shows that your abstract knowledge was wrong, but quality of engine is not the outcome itself.
http://en.wikipedia.org/wiki/Intentional_stance
Kenny, compared to cumulative processes like natural selection that mutate and select over and over—let alone human intelligence that navigates a compressed search space—to select one star cloud out of a million possibles is only 20 bits of information at most. And considering the variance between star clouds, that 20 bits of information won’t go very far. You can’t expect to find complex functionally optimized machinery within stars on the order of what exists in the smallest bacterium. If evolution has not gotten started anywhere else in the galaxy, then I fully expect that the rest of the entire Milky Way contains less interesting complexity than one Earth butterfly.
See e.g. “No Evolutions for Corporations or Nanodevices”.
Vladimir, if I understand both you and Eliezer correctly you’re saying that Eliezer is saying not “intelligence is reality-steering ability” but “intelligence is reality-steering ability modulo available resources”. That makes good sense, but that definition is only usable in so far as you have some separate way of estimating an agent’s available resources, and comparing the utility of what might be very different sets of available resources. (Compare a nascent superintelligent AI, with no ability to influence the world directly other than by communicating with people, with someone carrying a whole lot of powerful weapons. Who has the better available resources? Depends on context—and on the intelligence of the two.) Eliezer, I think, is proposing a way of evaluating the “intelligence” of an agent about which we know very little, including (perhaps) very little about what resources it has.
Put differently: I think Eliezer’s given a definition of “intelligence” that could equally be given as a definition of “power”, and I suspect that in practice using it to evaluate intelligence involves applying some other notion of what counts as intelligence and what counts as something else. (E.g., we’ve already decided that how much money you have, or how many nuclear warheads you have at your command, don’t count as “intelligence”.)
If you have two agent source codes, A and B, both provided with resource amount R (detailing computational architecture and tools such as telepresence robots and nanotech), then you will observe that if A is a stronger optimizer than B, A will get more done in less time, or alternatively hit a much more highly preferred world.
If you have one agent source code C and run two instances, one with resources M and one with resources N, where M > N, then the agent running on M will dominate the agent running on N.
“Intelligence” is cleverness of source code; “power” is the available resources. A really clever agent can outdo a stupid one even if ludicrously handicapped, resource-wise. A stupid agent with powerful nanotech dominates a clever agent with human servants.
what about chaotic neutral?
This is very similar to an earlier post. Eliezer, go faster. I, for one, am waiting for some non-trivial FAI math—is there any?
Stars are optimisation processes. They maximise entropy production. See the work of R. C. Dewar. Dewar’s work was based on the work of E. T. Jaynes in this area.
Vladimir, there’s plenty of non-trivial decision theory math that other people have invented.
If I were going to write up a piece of novel FAI math, it would be my general theory of Newcomblike problems, but the last time I tried that it started turning into a book. But that would be the main thing that I would do, if it got to the point that impressing people with at least one piece of elegant math turned into a high enough priority.
The state spaces of both chess and go have to be expanded to include “previous history”. In the case of go, this is to avoid the ko problem—moves that would produce the exact same board state as a previous state are forbidden. Chess, well, see a chess rule book, or wikipedia.
The first time you head to the supermarket it might be to stop yourself starving, but after you have done it a few times, it becomes an ingrained habit. And you might be heading to the supermarket because you always head to the supermarket on a Saturday. You might do so even if it becomes knowably suboptimal for starvation prevention, because you aren’t re-evaluating the economics of doing so.
This is a known and exploited facet of humans: hence the special offers that get you buying food somewhere for a while, in the hope that it becomes a habit.
I, however, sometimes spend far too long making food-buying decisions (i.e. the time costs me more, valued at what I could earn by working instead, than I save).
If your AI was making a decision to buy bread vs. making it, would it calculate the nutrition of store-bought bread, the cost of petrol and bread, wear and tear on the car, etc., vs. the time taken to make bread, the energy cost of baking it, ingredient costs, wear and tear on the oven, the cost to repair or replace the oven, etc., every time one of these values changed, or would it use cached results?
Of course, if the AI was going to be truly powerful it would be able to predict the secondary effects of its choices: would the money it spent on avoiding starvation end up in the hands of people trying to prevent starvation in general, i.e. people spending it to encourage research and actions that avert future food shortages (either by lobbying government or directly, depending on which was more influential)? I sometimes feel I need a very Powerful AGI just to buy my groceries.
I think it’s worth mentioning that Kasparov will have a harder time accurately predicting your moves than you will have predicting his. Each of you knows that Kasparov will win, but this will much more likely be due to a blunder on your part than a brilliancy on his. He may well reason, “sooner or later this patzer is going to hang a piece”, but he will have no way of knowing when.
Eliezer, what do you mean by “planning”? The word needs a technical definition. (So does “optimisation”, for that matter, by people on both sides of the claim that stars are, or are not, optimisers.)
Reaching goals does not necessarily involve planning. Water reaches the foot of a hill without planning. The room thermostat maintains the temperature without planning. It is said that no plan of battle survives contact with the enemy. I have a simulation of a robot that walks over uneven terrain and hunts food particles. That robot does no planning. I know: I created it. (Neither does it optimise, learn, adapt, remember, or predict. It merely works.) I know my way around the town I live in, and do not need to make any plan to reach the supermarket. I only need to know what to do at each point of the journey. In a strange town I would use a map and make a plan, but the plan would have to be continually updated according to conditions encountered.
To leap from observing the accomplishment of goals to “planning” is an anthropomorphisation of the sort that you have condemned in people working on AI, except when planning, technically defined, is demonstrated to actually be present in the mechanism. If “planning” does not mean some particular sort of mechanism, then you might as well call it “emergence”, or “pixies”.
By “steering” I understand continually correcting your course to maintain the approach to your goal. Steering is not planning. The thermostat steers the temperature and the robot steers to its prey without planning. Steering without planning is possible; planning without steering is useless. Relying on a plan makes it unreliable.
Anyone can test this. Give someone directions in the form of a plan of which way to go at each intersection. Don’t tell them the destination you’re aiming them at. Have them execute the plan and tell you where they got to.
Will, you still seem to think Eliezer is saying “do what an optimal agent would, even when doing so is not optimal.” No.
But then they’re not optimizing for a goal they can be said to have. If a street is closed, they won’t reach the destination, as they would if they knew what the destination was.
+1 to Will Pearson and Richard Kennaway. Humans mostly follow habit instead of optimizing.
Eliezer, this is interesting:
Some kind of bounded rationality? Could you give us a taste?
Tim Tyler: Stars are optimisation processes. They maximise entropy production.
No, at least, not by the definition of “optimisation processes” that Eliezer uses. Stars don’t plan to maximize entropy production, and gravity doesn’t plot a course down a hill so that a ball reaches the lowest point. Both can get stuck in a local maximum. Reread the post about Kasparov, and try to come up with a corresponding story about stars. Eliezer is not talking about the same kind of thing as you are.
You are being ridiculous—“optimisation process” is a perfectly standard term.
“Planning” is not part of the definition of what constitutes an optimisation process.
an outcome that ranks high in your preference ordering
Well if Garry’s wins are in the centre of your preference ordering circle of course you’ll lose! Some fighting spirit please!
Oh, and if something maximising entropy is a valid optimisation process, then surely everything is an optimisation process and the term becomes useless? Optimisation processes lead (locally) away from maximal entropy, not towards it, right?
Emile, you’ve mixed up “optimization process” and “intelligence”. According to your post, Eliezer wouldn’t consider evolution an optimization process. He does; he doesn’t consider it intelligent.
Not really. A salt crystal is not an optimisation process—at least not on conventional timescales. However, certainly optimisation is a widespread, “universal” phenomenon—driving most change in the universe and all self-organising systems—including biological ones.
That description certainly fits a star. The star is locally ordered (it’s a big, dense ball of matter), but it creates global disorder—in the form of heat and radiation.
However, I don’t think there is a general statement you can make about what optimisation processes do to “local” entropy levels. For a counter-example, consider gas expanding to fill a box. That is surely an optimisation process, and we know what solution it will converge on. However, there is no associated “optimising agent” using a heat pump to perform work to execute the task—and consequently there are not really any “local” increases in entropy occurring.
I meant “decreases”, of course. There might be local decreases if self-organising systems formed (e.g. if the gas became turbulent, and formed eddies), but that gets us into the realm of pedantic nit-picking.
Tim: I’m not offering a definition of “optimization process” or “intelligence”, I’m just pointing out that when you say “gravity is an optimization process”, or “stars are optimization processes”, you’re missing the whole point of the Kasparov post.
There’s a difference between the unpredictability of a ball rolling down a hill, and the unpredictability of Kasparov’s chess moves. This difference is not obvious.
Nick Tarleton: But then they’re not optimizing for a goal they can be said to have. If a street is closed, they won’t reach the destination, as they would if they knew what the destination was.
Neither will the person who has that goal, if they ignore the goal having made the plan.
People habitually overestimate the necessity of planning. Planning can be useful, but it is never enough, often unnecessary, and sometimes actively harmful.
You know, I’ve been thinking that I’m having trouble saying exactly what a “preference” is.
“that which an optimization process selects for” would seem to more or less declare everything an optimization process.
Anyone got a decent reduction on the notion of a preference itself? I’m kinda annoyed that I don’t seem to be able to quite work that out.
I was addressing the sentence: “But though stars burn longer and brighter than any craft of biology or human artifice, stars are neither optimization processes, nor products of strong optimization pressures.”—which, IMHO, is factually inaccurate. I do not see any evidence that I am missing anything.
“Preferences” is just another name for the values of agents.
Tim Tyler:
Yes, I understood that. I meant can you reduce it to non-mind? Besides, evolution is an optimization process, and I wouldn’t call it an agent that has values.
Either way, the point is that it’s just moving my confusion from the word “preferences” to the word “values”.
If I then say “those things an agent wants”, then I’m just stuck at the word “want” again. It’s not a reduction, just a repeated renaming of the same black box.
“Values” is another name for components of a utility function. One area where things get fuzzy is when you get to decide whether something qualifies as an agent or not—but that is a traditional problem which lies beyond the scope of this post.
Tim: no, I’d think of it in reverse, that a utility function is a very special type of encoding for a set of preferences.
Again, I’m not denying that I think I have an intuitive sense of what I mean by the term. It’s just that when I try to reduce it from something mind-like to something non-mind-like, the best I can come up with is stuff like “that which an optimization process selects for”.
At which point I have to declare everything an optimization process in some sense. (I’m actually semi-sorta tempted to do this: to talk about optimization power as a property of processes in general, rather than distinguishing certain types of processes as optimization processes. This way I think I’d have a reasonably serviceable reduction of the notion of a preference. Except then there are intelligent agents that aren’t logically omniscient and, say, can’t yet fully compute their morality (or primality or whatever, as appropriate) and thus in a sense don’t actually fully know their preferences.)
Well, there’s hopefully enough here to illustrate my confusion sufficiently that you or someone who’s actually worked out the correct answer can help me out here. I’m annoyed that I don’t know this. :)
FYI—the best chess player of the past 10 years has been a guy called Anand, and he is the best rapid chess player of all time. Maybe some day people will stop talking about Kasparov, who retired a few years ago!
I was brought to full view of the picture in your last paragraph and I have to say.....You could have made this much shorter and much easier to understand (Especially for those “lacking”). Otherwise about half of that article is just you spinning your wheels going nowhere. Don’t get me wrong, the idea is interesting, but for the most part it could have been expressed with,
“There are an infinite amount of possibilities for anything to happen. That said, in our day to day routines only our own intelligence, experiences, and knowledge can determine the amount of success that we have in our lives.”
Right? It’s similar to poetry, as I say, “If it has to be explained to you by someone more knowledgeable then it defeats the point of expressing yourself”