Eliezer: When you fold a complicated, choppy, cascade-y chain of differential equations in on itself via recursion, it should either flatline or blow up. You would need exactly the right law of diminishing returns to fly through the extremely narrow soft takeoff keyhole.
Goetz: This is the most important and controversial claim, so I’d like to see it better-supported. I understand the intuition; but it is convincing as an intuition only if you suppose there are no negative feedback mechanisms anywhere in the whole process, which seems unlikely.
Can you give a plausible example of a negative feedback mechanism as such, apart from a law of diminishing returns that would be (nearly) ruled out by historical evidence already available?
I suspect that human economic growth would naturally tend to be faster and somewhat more superexponential, if it were not for the negative feedback mechanism of governments and bureaucracies with poor incentives, which expand and hinder whenever times are sufficiently good that no one objects strongly enough to stop them. When “economic growth” is not the issue of top concern to everyone, all sorts of actions will be taken that hinder it; when the company is not in immediate danger of collapsing, the bureaucracies add on paperwork; and universities just go on adding paperwork indefinitely. So there are negative feedback mechanisms built into the human economic growth curve, but an AI wouldn’t have them, because they basically derive from us being stupid and having conflicting incentives.
What would be a plausible negative feedback mechanism—as opposed to a law of diminishing returns? Why wouldn’t the AI just stomp on the mechanism?
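To make the “flatline or blow up” intuition concrete, here is a minimal numerical sketch (my own toy model, not anything stated in the original exchange): treat capability y as being reinvested into its own growth rate, dy/dt = y^k. For k below 1 the curve is merely polynomial, for k above 1 it diverges in finite time, and only the knife-edge k = 1 gives the steady exponential that a soft takeoff would seem to require.

```python
# Toy illustration of the "flatline or blow up" intuition (not from the
# original post): capability y is reinvested into its own growth rate,
# dy/dt = y**k.  k < 1 gives subexponential (polynomial) growth, k = 1 is
# exactly exponential (the narrow keyhole), k > 1 blows up in finite time.

def integrate(k, y0=1.0, dt=1e-3, t_max=10.0, cap=1e12):
    """Forward-Euler integration of dy/dt = y**k, stopped if y exceeds cap."""
    y, t = y0, 0.0
    while t < t_max and y < cap:
        y += dt * y ** k
        t += dt
    return t, y

for k in (0.7, 1.0, 1.3):
    t_stop, y_stop = integrate(k)
    verdict = "blew up" if y_stop >= 1e12 else "stayed finite"
    print(f"k = {k}: stopped at t = {t_stop:.2f}, y = {y_stop:.3e} ({verdict})")
```

The exponent 0.7 saturates, 1.3 explodes within a few time units, and only 1.0 sits on the narrow boundary between the two regimes.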
Hanson: Depending on which abstractions you emphasize, you can describe a new thing as something completely new under the sun, or as yet another example of something familiar. So the issue is which abstractions make the most sense to use. We have seen cases before where growth via one channel opened up more growth channels, further enabling growth. So the question is how similar those situations are to this situation, where an AI getting smarter allows an AI to change its architecture in more and better ways. Which is another way of asking which abstractions are most relevant.
Well, the whole post above is just putting specific details on that old claim, “Natural selection producing humans and humans producing technology can’t be extrapolated to an AI insightfully modifying its low-level brain algorithms, because the latter case contains a feedback loop of an importantly different type; it’s like trying to extrapolate a bird flying outside the atmosphere or extrapolating the temperature/compression law of a gas past the point where the gas becomes a black hole.”
If you just pick an abstraction that isn’t detailed enough to talk about the putative feedback loop, and then insist on extrapolating out the old trends from the absence of the feedback loop, I would consider this a weak response.
Pearson: I think you have a tendency to overlook our lack of knowledge of how the brain works. You talk of constant brain circuitry, when people add new hippocampal cells throughout their lives. We also expand the brain areas devoted to the fingers if we are born blind and use braille.
Pearson, “constant brains” means “brains with constant adaptation-algorithms, such as an adaptation-algorithm for rewiring via reinforcement,” not “brains with constant synaptic networks.” I think a bit of interpretive charity would have been in order here.
Finney: I’d like to focus on the example offered: “Write a better algorithm than X for storing, associating to, and retrieving memories.” Is this a well defined task? Wouldn’t we want to ask, better by what measure? Is there some well defined metric for this task?
Hal, if this is taking place inside a reasonably sophisticated Friendly AI, then I’d expect there to be something akin to an internal economy of the AI with expected utilons as the common unit of currency. So if the memory system is getting any computer time at all, the AI has beliefs about why it is good to remember things and what other cognitive tasks memory can contribute to. It’s not just starting with an inscrutable piece of code that has no known purpose, and trying to “improve” it; it has an idea of what kind of labor the code is performing, and which other cognitive tasks that labor contributes to, and why. In the absence of such insight, it would indeed be more difficult for the AI to rewrite itself, and its development at that time would probably be dominated by human programmers pushing it along.
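For concreteness, here is a minimal sketch (my construction, with made-up names, not a description of any actual Friendly AI design) of what such an internal economy might look like: subsystems are valued in a common currency of expected utilons, and compute is allocated in proportion to expected marginal value, so that “a better memory algorithm” cashes out as “more expected utilons per unit of compute” rather than as an unmeasurable “improvement.”

```python
# Minimal sketch (hypothetical, not any real architecture) of an "internal
# economy": cognitive subsystems are priced in a common currency of expected
# utilons, and compute time is allocated in proportion to expected value.

from dataclasses import dataclass

@dataclass
class Subsystem:
    name: str
    expected_utilons_per_second: float  # the AI's current belief, not ground truth

def allocate(subsystems, total_seconds):
    """Split a compute budget in proportion to expected marginal value."""
    total_value = sum(s.expected_utilons_per_second for s in subsystems)
    return {s.name: total_seconds * s.expected_utilons_per_second / total_value
            for s in subsystems}

budget = allocate(
    [Subsystem("memory_retrieval", 4.0),
     Subsystem("planning", 5.0),
     Subsystem("perception", 1.0)],
    total_seconds=3600,
)
print(budget)  # {'memory_retrieval': 1440.0, 'planning': 1800.0, 'perception': 360.0}
```

The only point of the sketch is that the memory system’s worth is legible in a shared currency, so the AI can ask whether a proposed rewrite is expected to buy more utilons per unit of compute than the code it replaces.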
Ian C.: Eliezer, would a human that modifies the genes that control how his brain is built qualify as the same class of recursion (but with a longer cycle-time), or is it not quite the same?
Owing to our tremendous lack of insight into how genes affect brains, and owing to the messiness of the brain itself as a starting point, we would get relatively slow returns out of this kind of recursion even before taking into account the 18-year cycle time for the kids to grow up.
However, on a scale of returns from ordinary investment, the effect on society of the next generation being born with an average IQ of 140 (on the current scale) might be well-nigh inconceivable. It wouldn’t be an intelligence explosion; it wouldn’t be the kind of feedback loop I’m talking about—but as humans measure hugeness, it would be huge.
Reid: I’m sure you’re aware of Schmidhuber’s forays into this area with his Gödel Machine. Doesn’t this blur the boundaries between the meta-cognitive and cognitive?
Schmidhuber’s “Gödel Machine” is talking about a genuine recursion from object-level to metacognitive level, of the sort I described. However, this problem is somewhat more difficult than Schmidhuber seems to think it is, to put it mildly—but that would be part of the AIXI sequence, which I don’t think I’ll end up writing. Also, I think some of Schmidhuber’s suggestions potentially hamper the system with a protected level.
Vassar: OTOH, it bizarrely appears to be the case that, over a large range of chess ranks, human players gain effective chess skill (as measured by rank) roughly linearly with training, while chess programs gain it only via exponential speed-ups.
I expect that what you’re looking at is a navigable search space that the humans are navigating and the AI is grasping through brute-force techniques—yes, Deep Blue wasn’t literally brute force, but it was still navigating raw Chess rather than Regularity in Chess. If you’re searching the raw tree, returns are logarithmic; the human process of grokking regularities seems to deliver linear returns over practice with a brain in good condition. However, with Moore’s Law in play (exponential improvements delivered by human engineers) the AIs outran the brains.
Humans getting linear returns where dumb algorithms get logarithmic returns, seems to be a fairly standard phenomenon in my view—consider natural selection trying to go over a hump of required simultaneous changes, for example.
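As an illustration of that contrast, here is a toy model (my own made-up numbers, purely to show the shape of the curves): brute-force search buys a roughly constant rating gain per doubling of nodes searched, while grokking regularities buys a roughly constant gain per year of practice; hardware doubling then turns the engine’s logarithmic returns on effort into a straight line in calendar time, which is how the machines eventually outran the brains.

```python
# Toy model (invented constants, for shape only): engines gain a constant
# rating increment per DOUBLING of nodes searched (logarithmic returns on
# effort); humans gain a constant increment per year of practice (linear
# returns).  With search effort doubling yearly, the engine's curve becomes
# linear in calendar time and eventually outruns the human's.

import math

GAIN_PER_DOUBLING = 60.0   # rating points per 2x search effort (made up)
GAIN_PER_YEAR = 50.0       # rating points per year of human practice (made up)

def engine_rating(nodes_per_move, base=1200.0):
    return base + GAIN_PER_DOUBLING * math.log2(nodes_per_move)

def human_rating(years_of_practice, base=1200.0):
    return base + GAIN_PER_YEAR * years_of_practice

print(f"{'year':>4} {'nodes/move':>12} {'engine':>7} {'human':>6}")
for year in range(0, 31, 5):
    nodes = 1e4 * 2 ** year   # search effort doubling yearly
    print(f"{year:>4} {nodes:>12.3g} {engine_rating(nodes):>7.0f} {human_rating(year):>6.0f}")
```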
Tim Tyler: Brainpower went into making new brains historically—via sexual selection. Feedback from the previous generation of brains into the next generation has taken place historically.
If no one besides me thinks this claim is credible, I’ll just go ahead and hold it up as an example of the kind of silliness I’m talking about, so that no one accuses me of attacking a strawman.
(Quick reductio: Imagine Jane Cavewoman falling in love with Johnny Caveman on the basis of a foresightful extrapolation of how Johnny’s slightly mutated visual cortex, though not useful in its own right, will open up the way for further useful mutations, thus averting the unforesightful basis of natural selection… Sexual selection just applies greater selection pressure to particular characteristics; it doesn’t change the stupid parts of evolution at all—in fact, it often makes evolution even more stupid by decoupling fitness from characteristics we would ordinarily think of as “fit”—and this is true even though brains are involved. Missing this and saying triumphantly, “See? We’re recursive!” is an example of the overeager rush to apply nice labels that I was talking about earlier.)
Drucker: The problem, as I see it, is that you can’t take bits out of a running piece of software and replace them with other bits, and have them still work, unless said piece of software is trivial… The human brain is a mass of interconnecting systems, all tied together in a mish-mash of complexity. You couldn’t upgrade any one part of it by finding a faster replacement for any one section of it. Attempting to perform brain surgery on yourself is going to be a slow, painstaking process, leaving you with far more dead AIs than live ones.
As other commenters pointed out, plenty of software is written to enable modular upgrades. An AI with insight into its own algorithms and thought processes is not making changes by random testing like it was bloody evolution or something. A Friendly AI uses deterministic abstract reasoning in this case—I guess I’d have to write a post about how that works to make the point, though.
A poorly written AI might start out as the kind of mess you’re describing, and of course, also lack the insight to make changes better than random; and in that case, would get much less mileage out of self-improvement, and probably stay inert.
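To make the modularity point concrete, here is a minimal sketch (hypothetical names and interfaces, not any real AI architecture): a component that sits behind an explicit interface, with a known purpose and a stated contract, can be replaced by a candidate that is checked against that contract before the swap, which is the opposite of blindly mutating bits in a running blob and hoping.

```python
# Minimal sketch (hypothetical) of a modular upgrade: the component's purpose
# is explicit in an interface, so a candidate replacement can be reasoned
# about and verified against the contract before it is swapped in.

from abc import ABC, abstractmethod

class MemoryStore(ABC):
    @abstractmethod
    def put(self, key, value): ...
    @abstractmethod
    def get(self, key): ...

class ListMemoryStore(MemoryStore):
    """Naive O(n) baseline implementation."""
    def __init__(self):
        self._items = []
    def put(self, key, value):
        self._items.append((key, value))
    def get(self, key):
        for k, v in reversed(self._items):
            if k == key:
                return v
        return None

class DictMemoryStore(MemoryStore):
    """Candidate O(1) replacement."""
    def __init__(self):
        self._items = {}
    def put(self, key, value):
        self._items[key] = value
    def get(self, key):
        return self._items.get(key)

def passes_spec(store_cls):
    """Check a candidate against the interface's behavioral contract."""
    store = store_cls()
    store.put("a", 1)
    store.put("a", 2)          # later writes shadow earlier ones
    return store.get("a") == 2 and store.get("missing") is None

current = DictMemoryStore if passes_spec(DictMemoryStore) else ListMemoryStore
print("active memory store:", current.__name__)
```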