A Primer On Risks From AI
The Power of Algorithms
Evolutionary processes are the most evident example of the power of simple algorithms [1][2][3][4][5].
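As a toy illustration of that claim, here is a minimal sketch of an evolutionary search in Python. Everything in it (the function name evolve, the target string, the population size and the mutation rate) is an assumption made for this example and is not taken from any of the cited references. Unlike natural selection, it has an explicit target baked into its fitness function, so it only shows how far blind mutation plus selection can get, not how evolution works in the wild.

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "
TARGET = "SIMPLE ALGORITHMS"  # hypothetical target, chosen only for illustration


def evolve(target, alphabet, pop_size=100, mutation_rate=0.05,
           max_generations=5000, seed=0):
    """Mutate-and-select search toward `target`.

    Returns (best_string, generations_used).
    """
    rng = random.Random(seed)

    def fitness(candidate):
        # Number of positions that already match the target.
        return sum(a == b for a, b in zip(candidate, target))

    def mutate(parent):
        # Each character is independently replaced with a random one
        # with probability `mutation_rate`.
        return "".join(
            rng.choice(alphabet) if rng.random() < mutation_rate else ch
            for ch in parent
        )

    # Start from a completely random string of the same length.
    best = "".join(rng.choice(alphabet) for _ in target)
    for generation in range(max_generations):
        if best == target:
            return best, generation
        offspring = [mutate(best) for _ in range(pop_size)]
        # Keep the parent in the pool so fitness never decreases.
        best = max(offspring + [best], key=fitness)
    return best, max_generations


best, generations = evolve(TARGET, ALPHABET)
print(f"reached {best!r} after {generations} generations")
```

Nothing in the loop understands the target. It only copies, perturbs and keeps whatever scores best, which is the sense in which the algorithm is simple.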
The field of evolutionary biology has gathered a vast amount of evidence [6] that establishes evolution as the process that explains the local decrease in entropy [7] that is the complexity of life.
Since it can be conclusively shown that all life is an effect of an evolutionary process, it follows that everything we do not understand about living beings is also an effect of evolution.
We might not understand the nature of intelligence [8] and consciousness [9] but we do know that they are the result of an optimization process that is neither intelligent nor conscious.
Therefore we know that it is possible for a physical optimization process to culminate in the creation of more advanced processes that feature superior qualities.
One of these qualities is the human ability to observe and improve the optimization process that created us. The most obvious example is science [10].
Science can be thought of as a civilization-level self-improvement method. It allows us to work together in a systematic and efficient way and to accelerate the rate at which further improvements are made.
The Automation of Science
We know that optimization processes that can create improved versions of themselves are possible, even without an explicit understanding of their own workings, as exemplified by natural selection.
We know that optimization processes can lead to self-reinforcing improvements, as exemplified by the adoption of the scientific method [11] as an improved evolutionary process and successor of natural selection.
This raises questions about the continuation of this self-reinforcing feedback cycle and its possible implications.
One possibility is to automate science [12][13] and apply it to itself and its improvement.
But science is a tool and its bottleneck is its users: humans, the biased [14] product of the blind idiot god that is evolution.
Therefore the next logical step is to use science to figure out how to replace humans with a better version of themselves: artificial general intelligence.
Artificial general intelligence that can recursively optimize itself [15] is the logical endpoint of various converging and self-reinforcing feedback cycles.
Risks from AI
Will we be able to build an artificial general intelligence? Yes, sooner or later.
Even the unintelligent, unconscious and aimless process of natural selection was capable of creating goal-oriented, intelligent and conscious agents that can think ahead, jump fitness gaps and improve upon the process that created them to engage in prediction and direct experimentation.
The question is, what are the possible implications of the invention of an artificial, fully autonomous, intelligent and goal-oriented optimization process?
One good bet is that such an agent will recursively improve its most versatile, and therefore instrumentally useful, resource: its general intelligence, that is, its cross-domain optimization power.
Since it is unlikely that human intelligence is the optimum, the positive feedback effect of using intelligence amplification to amplify intelligence is likely to lead to a level of intelligence that is generally more capable than the human level.
Humans are unlikely to be the most efficient thinkers because evolution is mindless and has no goals. Evolution did not actively try to create the smartest thing possible.
Nor is evolution limitlessly creative: each step of an evolutionary design must increase the fitness of its host. This makes it probable that there are artificial mind designs that can do what no product of natural selection could accomplish, since an intelligent artificer does not rely on the incremental fitness of each step in the development process.
It is actually possible that human general intelligence is the bare minimum, because the human level of intelligence might have been sufficient to both survive and reproduce, so that no further evolutionary pressure existed to select for even higher levels of general intelligence.
The implication of this possibility might be the creation of an intelligent agent that is more capable than humans in every sense. Maybe because it directly employs superior approximations of our best formal methods, which tell us how to update based on evidence and how to choose between various actions. Or maybe it will simply think faster. It doesn’t matter.
What matters is that a superior intellect is probable and that it will be better than us at discovering knowledge and inventing new technology. Technology that will make it even more powerful and likely invincible.
And that is the problem. We might be unable to control such a superior being, just as a group of chimpanzees is unable to stop a company from clearing its forest [16].
But even if such a being is only slightly more capable than us, we might find ourselves at its mercy nonetheless.
Human history provides us with many examples [17][18][19] that make it abundantly clear that even the slightest advantage can enable one group to dominate others.
What happens is that the dominant group imposes its values on the others. This in turn raises the question of what values an artificial general intelligence might have and the implications of those values for us.
Due to our evolutionary origins, our struggle for survival and the necessity to cooperate with other agents, we are equipped with many values and a concern for the welfare of others [20].
The information-theoretic complexity [21][22] of our values is very high. This means that it is highly unlikely for similar values to automatically arise in agents that are the product of intelligent design, agents that never underwent the millions of years of competition with other agents that equipped humans with altruism and general compassion.
But that does not mean that an artificial intelligence won’t have any goals [23][24], just that those goals will be simple and their realization remorseless [25].
An artificial general intelligence will do whatever is implied by its initial design. And we will be helpless to stop it from achieving its goals. Goals that won’t automatically respect our values [26].
A likely implication is the total extinction of all of humanity [27].
Further Reading
What should a reasonable person believe about the Singularity?
Artificial Intelligence as a Positive and Negative Factor in Global Risk
References
[1] Genetic Algorithms and Evolutionary Computation, talkorigins.org/faqs/genalg/genalg.html
[2] Fixing software bugs in 10 minutes or less using evolutionary computation, genetic-programming.org/hc2009/1-Forrest/Forrest-Presentation.pdf
[3] Automatically Finding Patches Using Genetic Programming, genetic-programming.org/hc2009/1-Forrest/Forrest-Paper-on-Patches.pdf
[4] A Genetic Programming Approach to Automated Software Repair, genetic-programming.org/hc2009/1-Forrest/Forrest-Paper-on-Repair.pdf
[5] GenProg: A Generic Method for Automatic Software Repair, virginia.edu/~weimer/p/weimer-tse2012-genprog.pdf
[6] 29+ Evidences for Macroevolution (The Scientific Case for Common Descent), talkorigins.org/faqs/comdesc/
[7] Thermodynamics, Evolution and Creationism, talkorigins.org/faqs/thermo.html
[8] A Collection of Definitions of Intelligence, vetta.org/documents/A-Collection-of-Definitions-of-Intelligence.pdf
[9] plato.stanford.edu/entries/consciousness/
[10] en.wikipedia.org/wiki/Science
[11] en.wikipedia.org/wiki/Scientific_method
[12] The Automation of Science, sciencemag.org/content/324/5923/85.abstract
[13] Computer Program Self-Discovers Laws of Physics, wired.com/wiredscience/2009/04/newtonai/
[14] List of cognitive biases, en.wikipedia.org/wiki/List_of_cognitive_biases
[15] Intelligence explosion, wiki.lesswrong.com/wiki/Intelligence_explosion
[16] 1% with Neil deGrasse Tyson, youtu.be/9nR9XEqrCvw
[17] Mongol military tactics and organization, en.wikipedia.org/wiki/Mongol_military_tactics_and_organization
[18] Wars of Alexander the Great, en.wikipedia.org/wiki/Wars_of_Alexander_the_Great
[19] Spanish colonization of the Americas, en.wikipedia.org/wiki/Spanish_colonization_of_the_Americas
[20] A Quantitative Test of Hamilton’s Rule for the Evolution of Altruism, plosbiology.org/article/info:doi/10.1371/journal.pbio.1000615
[21] Algorithmic information theory, scholarpedia.org/article/Algorithmic_information_theory
[22] Algorithmic probability, scholarpedia.org/article/Algorithmic_probability
[23] The Nature of Self-Improving Artificial Intelligence, selfawaresystems.files.wordpress.com/2008/01/nature_of_self_improving_ai.pdf
[24] The Basic AI Drives, selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf
[25] Paperclip maximizer, wiki.lesswrong.com/wiki/Paperclip_maximizer
[26] Friendly artificial intelligence, wiki.lesswrong.com/wiki/Friendly_artificial_intelligence
[27] Existential Risk, existential-risk.org
I’m intrigued as to the thought processes and motivations which led to this article in light of your previous two weeks of comments and posts.
I realized that I might have entered some sort of vicious circle of motivated skepticism.
I can’t ask other people to explore both sides of an argument if I don’t do so either.
Someone wrote that I shouldn’t ask AI researchers about risks from AI if I don’t understand the basic arguments underlying the possibility.
I was curious if my perception of the arguments in favor of risks from AI is flawed and if I am missing important points, since I haven’t read the Sequences.
I recently wrote that I agree with 99.99% of what Eliezer Yudkowsky writes. The number was wrong. But I wanted to show that it isn’t just made up.
I don’t perceive myself to be a troll at all. Although some unthoughtful comments might have given that impression.
Although it looks like everyone hates me now, I still don’t want to be wrong.
I know that not having read the Sequences is received badly, especially since I have posted a lot in the past. But that’s not some incredible evil plan or anything. I am unable to play games I want to play for longer than 20 minutes either. Yet I have to do physical exercises every day for like 2 hours, even though I don’t really want to. It sometimes takes me months to read a single book. I think some here underestimate how people can act in a weird way without being evil. I have been in psychiatric therapy for 3 years now (yeah, I can prove this).
I can neither get myself to read the Sequences nor am I able to ignore risks from AI. But I am trying.
Thank you for explaining.
I think you’re an important guy to have around for reasons of evaporative cooling.
I like the combination of conciseness and thoroughness you’ve achieved with this.
There are a couple of specific parts I’ll quibble about:
“The Automation of Science” section seems weaker to me than the others, perhaps even superfluous. I think the line I’ve quoted is the crux of the problem; I highly doubt that the development of AGI will be driven by any such motivations.
I assign a high probability to the proposition that we will be able to build AGI, but I think a straight “yes” is too strong here.
Agreed—AGI will probably not be developed with the aim of improving science.
I also want to quibble about this:
Since most readers don’t want to be replaced, at least in one interpretation of that term, this line sticks in the throat and breaks the flow. The natural response is something like “logical? According to whose goals?”
New York City is complex, yet it exists. Linux is complex, yet it exists. Something being in a tiny corner of a search space doesn’t mean it isn’t going to be hit.
Nobody argues that complex values will “automatically arise” in machines. They will be built in—in a similar way to the way car air bags were built in—or safety features on blenders were built in.
NYC and Linux were built incrementally. We can’t easily test a super intelligent AI’s morality in advance of deploying it. And the probability of failure is conjunctive, since getting just one thing wrong means failure.
We can and will test intelligent machines—in much the same way as we test them today.
Testing machines may not be “easy”—but it isn’t rocket science. You put the testee in a virtual world and test them there.
A prison composed by the previous generation can be made plenty secure.
An escaped criminal on the run doesn’t have much of a chance of overtaking the whole of the rest of society and its technology.
Getting one thing wrong is rarely fatal. We have failed thousands of times already—and there will be plenty more failures. There is a sense in which we have “only one chance of reaching 2100”—but I think this “only one chance” business is essentially engineered confusion and scaremongering. There isn’t really “only one chance”—since plenty of mistakes can be made.
It’s a lot harder than rocket science.
Aren’t rockets machines?
I test machines every day. It doesn’t seem to be that difficult to me.
The point was that this problem is framed badly. Humans won’t be testing a super intelligent machine’s morality. There’s the man-machine symbiosis to consider.
Hm?
Machines get tested by humans in conjunction with machines. When I test machines I use other machines to test them with. As we go along, machines will do more and more of the work. The testers will be getting smarter in addition to the testee. This isn’t a case of: humans on one side, machines on the other.
What if the testee realizes they are being tested and behaves differently than they would if unboxed? Security by obscurity doesn’t work well even against humans, so it seems best to use schemes that work even if the testee knows everything about them.
Furthermore, do you think a group of monkeys could design a cage that would keep you trapped?
http://lesswrong.com/lw/qk/that_alien_message/
Are there any historical cases of superintelligent escaped criminals? You sound awfully confident about a scenario that has no historical precedent.
Then, if you identify that as being a problem, you redesign your test harness.
Probably not—but that isn’t a terribly good analogy to any problem we are likely to face.
Well, of course not—though I do seem to recall a tale of one General Zod.
I’m doubting whether the situation with no historical precedent will ever come to pass. We have had escaped criminals in societies of their peers. In the future, we may still have some escaped criminals in societies of their peers - though hopefully a lot fewer.
What I don’t think we are likely to have is an escaped superintelligent criminal in an unadvanced society. Instead, I expect that a society able to produce such an agent will already be quite advanced—and that society as a whole will be able to advance faster than any escaped criminals will be able to manage—due to having more resources, manpower, etc.
It sounds to me like you are favoring the “everything’s going to be all right” conclusion quite heavily. You act like everything is going to be all right by default, and your arguments for why things will be all right aren’t very sophisticated.
And we will certainly identify it as being a problem because humans know everything and they never make mistakes.
I see, similar to how housing prices will never drop? Have you read up on black swans?
We are venturing into uncharted territory here. Historical precedents provide very weak information.
No.
Yes.
I don’t think it is likely that the world will end in accidental apocalypse in the next century.
Few do—AFAICS—and the main proponents of the idea are usually selling something.
What level on the disagreement hierarchy would you rate this comment of yours?
http://www.paulgraham.com/disagree.html
It looks like mostly DH3 to me, with a splash of DH1 in implying that anyone who suggests that our future isn’t guaranteed to be bright must be selling something.
There’s a bit of DH4 in implying that this is an uncommon position, which implies very weakly that it’s incorrect. I don’t think this is a very uncommon position though:
http://www.ted.com/talks/lang/en/martin_rees_asks_is_this_our_final_century.html
http://www.ted.com/talks/stephen_petranek_counts_down_to_armageddon.html
http://www.ted.com/talks/jared_diamond_on_why_societies_collapse.html
http://www.wired.com/wired/archive/8.04/joy.html
And Stephen Hawking on AI:
http://www.zdnet.com/news/stephen-hawking-humans-will-fall-behind-ai/116616
That’s a fair analysis of those two lines, though I didn’t say “anyone”.
For evidence for “uncommon”, I would cite the GLOBAL CATASTROPHIC RISKS SURVEY RESULTS. Presumably a survey of the ultra-paranoid. The figures they came up with were:
Number killed by molecular nanotech weapons: 5%.
Total killed by superintelligent AI: 5%.
Overall risk of extinction prior to 2100: 19%
Interesting data, thanks.
Out of curiosity, what are your current thoughts on the arguments you’ve laid out here?
Strong enough to justify the existence of an organisation like SIAI. Everything else is a matter of expected utility calculations. Which I am not able to handle. Not given my current education and not given my psyche.
I know that what I am saying is incredibly repugnant to some people here. I see no flaws. But I can’t help but flinch away from taking all those ideas seriously, although I am currently trying hard. I suppose the post above is a baby step.
This video pretty much is the window to my soul. You see how something can be completely rational yet feel ridiculous?
Less Wrong opens up the terrifying vistas of reality that I have tried to flee from since a young age.
-- The Call of Cthulhu
I felt compelled to try and see if I can make it all vanish.
I think I understand how you feel. Here is what I propose, for people who find these vistas of reality terrifying, and who may feel a need to approach them from a more “spiritual” (for lack of a better word) perspective: a true Singularity cult. By that I mean, no more pretending that you are a mere rationalist, coolly calculating the probabilities of heaven and hell, but rather to embrace the quasi-religious nature of this subject matter in all its glory. I have a pretty clear vision of such a cult, its ideology, activities and structure, and would like to know if anyone here is interested in such a thing. What I have in mind would be rather extreme and terrifying to the profane, and hence is better discussed in a more cult-like environment. For example, from the point of view of the “Cult of Omega”, the extinction of humanity is an all but inevitable and desirable outcome, as we march ineluctably toward the Singularity. I believe that if it was done well, such a cult could become the nexus of a powerful new religion which could totally remake the world.
Sure. I’m interested in all end-of-the-world cults. The more virulent their memes the better.
A cult with no respect for history? Those who don’t remember the past are doomed to repeat it.
Excellent! Perhaps you can be BetaOmega ;)
As far as history goes, there are some chapters that might be worth repeating. For example, what possessed the ancient Egyptians, suddenly and out of the stone age, to build huge monuments of great precision which still awe us after 4.5 thousand years? Some crazy pharaonic cult made that possible, and even though it seems totally irrational, I’m glad they did it! So maybe this is what we need today: a cult of the Machine which gives our technology an ideology, and even a religion. Otherwise it all seems rather pointless, doesn’t it?
Please don’t be too put off by my web site by the way—I was in a comic book supervillain phase when I created it which I’m finally getting over. Nor am I here to troll LessWrong; I think what has been created here is brilliant, and though it’s often accused of being cultish, maybe the real problem is that it isn’t cultish enough!
I don’t think so. It increases entropy, in fact.
http://www.amazon.com/Evolution-Entropy-Science-Conceptual-Foundations/dp/0226075745
There are also counterexamples of technologically underprivileged groups resisting quite successfully. I think there might be a chapter on this in War Before Civilization.
Beware positive bias.
Regarding the vast ‘mind design space’: it gets infinitely smaller once you stop considering the theoretical stuff based on oracles and realize that the classical computing AI, the one still competing with us for resources, can only square or cube current processing power before it runs out of things in the universe.
Let’s say to the 4th power, just to cover our bases. The computational complexity of e.g. weather forecasting (or forecasting of any other nonlinear phenomenon) grows at least exponentially with the forecast time, with no shortcuts, and so your superhuman intelligence is not even a very impressive weather forecaster, with a forecast span only 4 times longer. It is also not a very impressive self-forecaster. And it is not very impressive at handling nonlinearly coupled unknowns (for which it must simulate all combinations to get the future utility, i.e. 100 unknowns with 10 values each = a space of 10^100).
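The arithmetic behind that claim can be sketched as follows, under the comment's own assumption that the cost of a forecast grows exponentially with its horizon. The growth rate and the baseline compute budget are placeholders invented for this example; only the ratio between the two horizons matters.

```python
import math


def forecast_horizon(compute_budget, growth_rate=1.0):
    """Longest horizon t affordable when cost(t) = exp(growth_rate * t).

    Solving exp(growth_rate * t) = compute_budget for t gives a horizon
    that grows only with the logarithm of the available compute.
    """
    return math.log(compute_budget) / growth_rate


def joint_configurations(unknowns, values_each):
    """Size of the space when every combination must be simulated."""
    return values_each ** unknowns


baseline_compute = 1e20  # hypothetical budget, arbitrary units
baseline_horizon = forecast_horizon(baseline_compute)
boosted_horizon = forecast_horizon(baseline_compute ** 4)

# Raising compute to the 4th power multiplies the horizon by 4,
# regardless of the growth rate or the baseline chosen.
horizon_ratio = boosted_horizon / baseline_horizon
print(f"horizon ratio: {horizon_ratio:.6f}")

# 100 coupled unknowns with 10 values each.
space = joint_configurations(100, 10)
print(f"joint configurations: 10^{len(str(space)) - 1}")
```

If the real cost grows faster than exponentially, the gain in horizon would be smaller still; the sketch only covers the exponential case the comment describes.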
It just operates based on time-local strategies with rules like ‘assign negative utility to actions you can’t undo’ (proportional to the accuracy of undoing), ‘assign positive utility to the logarithm of the number of available choices’, ‘assign positive utility to the collection of interesting information’, as well as more sophisticated, complementary ones, which it would come up with not by forecasting but by testing strategies on hypothetical scenarios. The end result likely won’t resemble straightforward utility maximization any more than human behaviour resembles utility maximization. It would be more optimal, but once again, not in the sense of outperforming theoretical utility maximization at maximizing utility, but in the sense of trading accuracy for speed better.
The bottom line is that the AIs that fit inside our universe are only a very tiny fraction of mind design space; the very scary super psychopathic monsters were pulled out of other parts of the mind design space, far off in the area where gods live.
This is not to say that there is absolutely no risk in AIs, but the risk is of an entirely different kind, arising from an entirely different kind of entity. One should be careful not to set off the time-local strategies that destroy you: overt unfriendliness towards AI may set off one or another strategic solution, and lead to a more or less discriminating response.
Unless it develops better quantum computing or exploits other strange physical phenomena we don’t know about.
At which point all bets are off with regard to whether it would even compete for resources. A lot of stuff of this kind can happen, e.g. discovering that we are in a simulation.
The bottom line is, now that we have established limitations, claims about AI risk had better be prefaced with “If AI develops awesome quantum computing, but it still needs atoms from your body, the following theorizing about godlike AIs might apply:”.
Furthermore, as the AI has to start off with the low-hanging fruit (optimizing itself on commodity hardware), the argument is not about random points in the awesome quantum mind design space, but about our original classical AI’s creations, subject to the original AI getting perfectly ordinary cold feet about the transition due to a wide variety of heuristics that scream ‘no, that change is too big and unpredictable’, and to the AI trying to preserve itself through the transition.
But we don’t know how much it can improve its algorithms before it hits the theoretical limit of efficient resource utilization and has to start expanding outward. You sound a bit like you’re assuming that human brains are already near that limit (so that the only way that an AI could beat us was by grabbing lots and lots of resources). So resource-boundedness doesn’t really tell us anything about the upper bound on AI harmfulness. That’s one of the ways you could try to defeat the argument about AI risk. Another would be arguing about the thesis that AIs are likely to be harmful if safety isn’t carefully engineered into them. I don’t see how we could deduce anything reassuring about that issue from the fact that AIs will have bounded resources.
You forget one important bit: there are other sources of insight about AI than human analogies and speculation about what an oracle would do.
The fact stands that no amount of optimization of the AI’s ‘predictor of the future’ can beat the Lyapunov exponent; the AI, however effective, still can’t forecast jack shit. On top of that, the AI has an enormous number of actions it can take, far, far more than it can process if it were to evaluate them by forecasting the outcomes very accurately.
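To make the Lyapunov point concrete, here is a small sketch using the logistic map, a standard textbook chaotic system chosen for this example (the comment itself does not mention it). The starting value, the error sizes and the divergence threshold are all arbitrary assumptions. The sketch counts how many steps two nearly identical starting states stay close together.

```python
def steps_until_divergence(x0, initial_error, threshold=0.1, max_steps=1000):
    """Iterate x -> 4x(1 - x) from x0 and from x0 + initial_error.

    Returns the first step at which the two trajectories differ by more
    than `threshold`, or `max_steps` if that never happens.
    """
    a, b = x0, x0 + initial_error
    for step in range(1, max_steps + 1):
        a = 4.0 * a * (1.0 - a)
        b = 4.0 * b * (1.0 - b)
        if abs(a - b) > threshold:
            return step
    return max_steps


# Same starting point, measured with a millionfold difference in precision.
coarse_steps = steps_until_divergence(0.2, 1e-6)
fine_steps = steps_until_divergence(0.2, 1e-12)

print(f"error 1e-6  -> useful forecast for {coarse_steps} steps")
print(f"error 1e-12 -> useful forecast for {fine_steps} steps")
```

Because the error in this map roughly doubles with every step, a millionfold improvement in the initial measurement buys only on the order of twenty extra steps of useful forecast. That is the sense in which a better predictor cannot outrun the exponent.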
A very important optimization is not thinking about stuff that has a low payoff/thought ratio. Long forecasts have very low payoff, as the cost is exponential in time. Inventions, on the other hand, should pay off very well. The survival of mankind does not hinge on a calculation like ‘okay, action A destroys mankind and action B does not, and action A leads to ever so slightly higher utility 1000 years from now; mankind has got to go’. That sort of forecasting is infeasible. A forecast with a short time span is foolish and near-sighted, while the far future is unknown.
It hinges on some general-purpose, effective strategies of the kind ‘it is valuable to maximize the choices in the future’, ‘information is valuable’ (curiosity), ‘penalize actions according to how badly you can undo them’. Versus ‘if they all act paranoid on Eliezer’s suggestion, they might try to damage me’. (Though, hopefully, if the AI is engineering a virus against mankind, the virus won’t exterminate but will medicate for paranoia, because that solves the immediate problem just as well while leaving more options for the future and avoiding actions that can’t be undone even approximately.) We are still made of atoms that the AI could use, but there are plenty of other atoms around. I’d be more worried about it wanting our computation hardware (brains). Crappy it might be, but it is already around, and doesn’t need to be manufactured.
It is very ineffective to just go ahead and limit future options and do entirely irreversible things simply because you don’t have a simple expected-utility-based answer to “why not”. The future you will still be trying to achieve whatever goals you want to achieve, choosing among the choices it has available, and giving the future self more choices is an extremely solid heuristic even though you can’t straightforwardly calculate the expected utility of doing so (due to recursion).
The oracle is the first mental superpower (the idea dates back quite a while), the least feasible, but the easiest for the uneducated to speculate about. It is the easiest way to portray super-intelligence without being super intelligent: why, the super intelligence just knows the complete outcomes of its actions and chooses the best action. That’s easy to think of, and is also an extremely inefficient approach to maximizing anything.
The point here is not that an AI would necessarily be unable to get rid of mankind. The point is that an AI is not particularly more likely to do so than mankind itself (and may well be less likely). The risks are differential. People are prone to retarded ideologies too. Other people are not you-friendly intelligences, which is totally obvious when you have not lived all your life in the privileged class. Groups of other people are unfriendly non-you intelligences too, highly dangerous and prone to hurting you even if it hurts them as well. Presumably the AI will at least be friendly enough not to hurt you at its own expense; you can’t assume even this rudimentary friendliness of your fellow ferocious survival machines, crazed and meme-infested to the brim. It remains to be shown that AI is any more of an existential risk than all-natural human stupidity. When one is speculating up scary stuff, one can speculate up scary human ideologies and social orders as easily as scary AI goal systems. The AI can stop us from killing ourselves, or may kill us; that is not yet a risk until you show that the former is significantly less likely than the latter.