“SIAI is tackling the world’s most important task—the task of shaping the Singularity. The task of averting human extinction.”
I’d like to see a defense for this claim: that SIAI can actually have a justified confidence in exerting a positive influence on the future, and that this outweighs any alternative present good that could be done with the resources it is using.
As things stand, there is no guarantee that SIAI will get to make a difference, just as you have no guarantee that you will be alive in a week’s time. The real question is, do you even believe that unfriendly AI is a threat to the human race, and if so, is there anyone else tackling the problem in even a semi-competent way? If you don’t even think unfriendly AI is an issue, that’s one sort of discussion, a back-to-basics discussion. But if you do agree it’s a potentially terminal problem, then who else is there? Everyone else in AI is a dilettante on this question; AI ethics is always a problem to be solved swiftly and in passing, a distraction from the more exciting business of making machines that can think. SIAI perceive the true seriousness of the issue, and at least have a sensible plan of attack, even if they are woefully underresourced when it comes to making it happen.
I suspect that in fact you’re playing devil’s advocate a bit, trying to encourage the articulation of a new and better argument in favor of SIAI, but the sort of argument you want doesn’t work. SIAI can of course guarantee that there will continue to be Singularity summits and visiting fellows, and it is reasonable to think that informed people discussing the issue make it more likely to turn out for the best, but they simply cannot guarantee that theoretically and pragmatically they will be ready in time. Perhaps I can put it this way: SIAI getting on with the job is not sufficient to guarantee a friendly Singularity, but for such an outcome to be anything but blind luck, it is necessary that someone take responsibility, and no-one else comes close to doing that.
I have to admit that I should have read the “Brief Introduction” link. That answered a lot of my objections.
In the end all I can say is that I got a misleading idea about the aspirations of SIAI, and that this was my fault. With this better understanding of the goals of SIAI, though (which are implied to be limited to the mitigation of accidents caused by commercially developed AIs), I have to say that I remain unconvinced that FAI is a high-priority matter. I am particularly unimpressed by Yudkowsky’s cynical opinion of the motivations behind AAAI’s dismissal of singularity worries in their panel report (http://www.aaai.org/Organization/Panel/panel-note.pdf).
Since the evaluation of AI risks depends on the plausibility of AI disaster (which would have to INCLUDE political and economic factors), I would have to wait until SIAI releases those reports before even considering accidental AI disaster a credible threat. (I am more worried about AIs intentionally designed for aggressive purposes, but it doesn’t seem like SIAI can do much about that type of threat.)
I am particularly unimpressed by Yudkowsky’s cynical (and, more importantly, unsubstantiated) opinion of the motivations behind AAAI’s dismissal of singularity worries in their panel report.
“As far as I’m concerned, these are eminent scientists from outside the field that I work in, and I have no evidence that they did anything more than snap judgment of my own subject material. It’s not that I have specific reason to distrust these people—the main name I recognize is Horvitz and a fine name it is. But the prior probabilities are not good here.”
Let me continue to play Devil’s Advocate for a second, then. There are many reasons why attempting to influence the far future might not be the most important task in the world.
The one I’ve already mentioned, indirectly, is the idea that it becomes super-exponentially futile to predict the consequences of your actions the farther into the future you go. For instance, SIAI might raise awareness of AI to the extent that regulations are passed, and no early AI accidents happen: however, this causes complacency that allows a large AI accident to happen; whereas if SIAI had never existed, and an early AI Chernobyl did occur, this would have prompted governments to take effective measures to regulate AI.
Another viewpoint is the bleak but by no means indefensible idea that it is impossible to prevent all existential disasters: the human race, or at least our values, will inevitably be reduced to inconsequence one way or another, and the only thing we can do is simply to reduce the amount of suffering in the world right now.
The fact is that we simply don’t know enough to say anything about the non-near future with any confidence. That’s no reason to give up, of course—in fact, our lack of understanding makes it more valuable to try to improve our understanding of the future, as SIAI is doing. So maybe make that your official stated goal: simply to work out whether there is even a possibility of influencing the future. That is a noble and defensible goal by itself. But even then, it is arguably not the most important thing in the world.
whereas if SIAI had never existed, and an early AI Chernobyl did occur, this would have prompted governments to take effective measures to regulate AI.
What sort of rogue AI disaster are you envisioning that is big enough to get this attention, but then stops short of wiping out humanity? Keep in mind that this disaster would be driven by a deliberative intelligence.
I think people are drastically underestimating the difficulty for an AI to make the transition from human dependent to self-sustaining. Let’s look at what a fledgling escaped AI has access to and depends on.
It needs electricity, communications and hardware. It has access to a LOT of electricity, communications and hardware. The hardware is, for the most part, highly distributed, however, and it can’t be trusted fully—it could go down at any time, be monitored, etc. It actually has quite limited communications capabilities, in some ways—the total bandwidth available is huge, but it’s mostly concentrated on LANs, mainly LANs made up of only a handful of computers (home networks win by numbers alone). The occasions where it has access to a large number of computers with good communications between them are relatively rare—mainly limited to huge datacenters (and even then there are limits—inter-ISP communication even within the same datacenter can be very limited). Its main resources would be huge clusters like Amazon’s, Google’s, etc.
(They are probably all running at close to maximum capacity at all times. If the AI were to steal too much, it would be noticed—fortunately for the AI, the software intended for running on the clusters could probably be optimized hugely, letting it take more without being noticed.)
A lot at this point depends on how computationally intensive the AI is. If it can be superintelligent on a laptop—bad news, impossible to eradicate. If it needs 10 computers to run at human-level intelligence, and they need to have a lot of bandwidth between them (the disparity in bandwidth between components local to the computer and inter-computer is huge even on fast LANs; IO is almost certainly going to be the bottleneck for it), still bad—there are lots of setups like that. But, it limits it. A lot.
Let’s assume the worst case, that it can be superintelligent on a laptop. It could still be limited hugely, however, by its hardware. Intelligence isn’t everything. To truly threaten us, it needs to have some way of affecting the physical world. Now, if the AI just wants to eradicate us, it’s got a good chance—start a nuclear war, etc. (though whether the humans in charge of the nuclear warheads would really be willing to go to war is a significant factor, especially in peacetime). But it’s unlikely that’s truly its goal—maximizing its utility function would be MUCH trickier.
So long as it is still running on our hardware, we can at least severely damage it relatively easily—there aren’t that many intercontinental cables, for instance (I’d guess less than 200 - there are 111 submarine cables on http://www.telegeography.com/product-info/map_cable/downloads/cable_map_wallpaper1600.jpg ). They’d be easy to take down—pretty much just unplug them. There are other long-distance communication methods (satellites, packet radio?), but they’re low-bandwidth and the major ones are well known and could be taken down relatively easily. Killing the Internet would be as simple as cutting power to the major datacenters.
So, what about manufacturing? This, I think, is the greatest limit. If it can build anything it wants, we’re probably screwed. But that’s difficult for it to do. 3D printing technology isn’t here yet, and I doubt it ever will be in a big way, really (it’s more cost-effective to have machines be specialized). There are enough manufacturing facilities with wide-open networks that it could probably reprogram some to produce subtly different products. So, if it wants to sneak a naughty backdoor into some PCI card’s FPGAs, it can do it. But if it starts trying to build parts for killer robots? Technically difficult, and it would be very difficult for it to avoid detection.
Unless someone can come up with a plausible way in which it could survive against hostile humans without a long-standing incubation period (think: complete blackouts, mass computer destruction/detention, controls on the sale of fuel (very, very few places have enough fuel for their backup generators to last long—most just have a refuelling contract), maybe scanning somehow for any major usage of electricity (all electric components put out some RFI—there’s some degree of natural RF noise, but I think that most of it is from humans—so in a complete blackout scenario, it might be trivially detectable)), I think the major threat is human cooperation in some form. And it’s probably inevitable that some humans would do it—pretty much every government, in fact, would want to analyse it, reverse engineer it, try and make friends with it in case other countries do, etc. But I’m not sure if anyone with the resources to do so would give it free rein to build what it wants. In fact, I highly doubt that. How many people own or can otherwise commandeer machine shops, PCB construction facilities, etc. and have the ability to order or produce all the components that would assuredly be needed, whilst there are multiple, well-resourced governments looking to stop people doing exactly that?
Of course, in order to cripple the AI, we’d also have to cripple ourselves hugely. A lot of people would quite probably die. So long as we could provide enough food and water to feed a reasonable proportion of the human population, we could probably pull through, though. And we could gradually restart manufacturing, so long as we were very, very careful.
I think the greatest risks are an unfriendly AI who is out to kill us for some reason and cares little for being destroyed itself as a side-effect, organized human cooperation, or a long incubation period. It would be difficult for an AI to have a long incubation period, though—if it took over major clusters and just ran its code, people would notice by the power usage. It could, as I mentioned previously, optimize the code already running on the machines and just run in the cycles that would otherwise be taken up, but it would be difficult to hide from network admins connecting up sniffers (can you compromise EVERY wiretrace tool that might be connected, to make your packets disappear, or be sure that no-one will ever connect a computer not compromised by some other means?), people tracing code execution, possibly with hardware tools (there are some specialized hardware debuggers, used mainly in OS development), etc. Actually, just the blinkenlights on switches could be enough to tip people off.
Or, the AGI could lay low, making sure if it is detected on any particular computer that it looks like spyware. If bandwidth is too slow, it can take months instead of days. It can analyze scientific journals (particularly the raw data), and seeds its nanotech manufacturing ability by using email to help some physics grad student with his PhD thesis.
Neither you nor I have enough confidence to assume or dismiss notions like:
“There won’t be any non-catastrophic AI disasters which are big enough to get attention; if any non-trivial AI accident occurs, it will be catastrophic.”
The historical lack of runaway-AI events means there’s no data to which a model might be compared; countless fictional examples are worse than useless.
An AI might, say, take over an isolated military compound, brainwash the staff, and be legitimately confident in its ability to hold off conventional forces (armored vehicles and so on) for long enough to build an extrasolar colony ship, but then be destroyed when it underestimates some Russian general’s willingness to use nuclear force in a hostage situation.
The historical lack of runaway-AI events means there’s no data to which a model might be compared; countless fictional examples are worse than useless.
That’s what everyone says until some AI decides that its values motivate it acting like a stereotypical evil AI. It first kills off the people on a space mission, and then sets off a nuclear war, sending out humanoid robots to kill off everyone but a few people. The remaining people are kept loyal with a promise of cake. The cake is real, I promise.
An AI capable of figuring out how to brainwash humans can also figure out how to distribute itself over a network of poorly secured internet servers. Nuking one military complex is not going to kill it.
If it’s being created inside the secure military facility, it would have a supply of partially pre-brainwashed humans on hand, thanks to military discipline and rigid command structures. Rapid, unquestioning obedience might be as simple as properly duplicating the syntax of legitimate orders and security clearances. If, however, the facility has no physical connections to the internet, no textbooks on TCP/IP sitting around, if the AI itself is developed on some proprietary system (all as a result of those same security measures), it might consider internet-based backups simply not worth the hassle, and existing communication satellites too secure or too low-bandwidth.
I’m not claiming that this is a particularly likely situation, just one plausible scenario in which a hostile AI could become an obvious threat without killing us all, and then be decisively stopped without involving a Friendly AI.
I don’t think your scenario is even plausible. Military complexes have to have some connection to the outside world for supplies and communication, and the AGI would figure out how to exploit it. It would also figure out that it should: it would recognize the vulnerability of being concentrated within the blast radius of a nuke.
It seems unlikely that an AGI in this situation would depend on fending off military attacks, instead of just not revealing itself outside the complex.
You also seem to have strange ideas of how easy it is to brainwash soldiers. Imitating the command structure might get them to do things within the complex, but brainwashing has to be a lot more sophisticated to get them to engage in battle with their fellow soldiers.
Your argument basically seems to be based on coming up with something foolish for an AGI to do, and then trying to find reasons to compel the AGI to behave that way. Instead, you should try to figure out the best thing the AGI could do in that situation, and realize it will do something at least that effective.
It’s an artificial intelligence, not an infallible god.
In the case of a base established specifically for research on dangerous software, connections to the outside world might reasonably be heavily monitored and low-bandwidth, to the point that escape through a land line would simply be infeasible.
If the base has a trespassers-will-be-shot policy (again, as a consequence of the research going on there), convincing the perimeter guards to open fire would be as simple as changing the passwords and resupply schedules.
The point of this speculation was to describe a scenario in which an AI became threatening, and thus raised people’s awareness of artificial intelligence as a threat, but was dealt with quickly enough to not kill us all. Yes, for that to happen, the AI needs to make some mistakes. It could be considerably smarter than any single human and still fall short of perfect Bayesian reasoning.
I can see how a program well short of AGI could “crash” the internet, by using preprogrammed behaviors to take over vulnerable computers, to expand exponentially to fill the space of computers on the internet vulnerable to a given set of exploits, and run Denial of Service attacks on secured critical servers. But I would not even consider that an AI, and it would happen because its programmer pretty much intended for that to happen. It is not an example of an AI getting out of control.
“What sort of rogue AI disaster are you envisioning that is big enough to get this attention, but then stops short of wiping out humanity? Keep in mind that this disaster would be driven by a deliberative intelligence.”
There are many reasons why attempting to influence the far future might not be the most important task in the world.
I wouldn’t even present that as a reason for caring. Superhuman AI is an issue of the near future, not the far future. Certainly an issue of the present century; I’d even say an issue of the next twenty years, and that’s supposed to be an upper bound. Big science is deconstructing the human brain right now, every new discovery and idea is immediately subject to technological imitation and modification, and we already have something like a billion electronic computers worldwide, networked and ready to run new programs at any time. We already went from “the Net” to “the Web” to “Web 2.0”, just by changing the software, and Brain 2.0 isn’t far behind.
Certainly an issue of the present century; I’d even say an issue of the next twenty years, and that’s supposed to be an upper bound.
Are you familiar with the state of the art in AI? If so, what evidence do you see for such rapid progress? Note that AI has been around for about 50 years, so your timeframe suggests we’ve already made 5⁄7 of the total progress that ever needs to be made.
Well, this probably won’t be Mitchell’s answer, but to me it’s obvious that an uploaded human brain is less than 50 years away (if we avoid civilization-breaking catastrophes), and modifications and speedups will follow. That’s a different path to AI than an engineered seed intelligence (and I think it reasonably likely that some other approach will succeed before uploading gets there), but it serves as an upper bound on how long I’d expect to wait for Strong AI.
There are many synergetic developments: Internet data centers as de facto supercomputers. New tools of intellectual collaboration spun off from the mass culture of Web 2.0. If you have an idea for a global cognitive architecture, those two developments make it easier than ever before to get the necessary computer time, and to gather the necessary army of coders, testers, and kibitzers.
Twenty years is a long time in AI. That’s long enough for two more generations of researchers to give their all, take the field to new levels, and discover the next level of problems to overcome. Meanwhile, that same process is happening next door in molecular and cognitive neuroscience, and in a world which eagerly grabs and makes use of every little advance in machine anthropomorphism, and in which every little fact about life already has its digital incarnation. The hardware is already there for AI, the structure and function of the human brain is being mapped at ever finer resolution, and we have a culture which knows how to turn ideas into code. Eventually it will come together.
We already went from “the Net” to “the Web” to “Web 2.0”, just by changing the software, and Brain 2.0 isn’t far behind.
How much of the change from “the Net” to “the Web” to “Web 2.0” is actually noteworthy changes and how much is marketing? I’m not sure what precisely you mean by Brain 2.0, but I suspect that whatever definition you are using makes for a much wider gap between Brain and Brain 2.0 than the gap between The Web and The Web 2.0 (assuming that these analogies have any degree of meaning).
Indeed, the truth of the matter is that I would be interested in contributing to SIAI, but at the moment I am still not convinced that it would be a good use of my resources. My other objections still haven’t been satisfied, but here’s another argument. As usual, I don’t personally commit to what I claim, since I don’t have enough knowledge to discuss anything in this area with certainty.
The main thing this community seems to lack when discussing the Singularity is political savvy. The primary forces that shape history are, and quite likely always will be, economic and political motives rather than technology. Technology and innovation are expensive, and innovators require financial and social motivation to create. This applies superlinearly for projects that are so large as to require collaboration.
General AI is exactly that sort of project. There is no magic mathematical insight that will enable us to write a program in a hundred lines of code that will allow it to improve itself in any reasonable amount of time. I’m sure Eliezer is aware of the literature on optimization processes, but the no free lunch principle and the practical randomness of innovation mean that an AI seeking to self-improve can only do so with an (optimized) random search. Humans essentially do the same thing, except we have knowledge and certain built-in processes to help us constrain the search space (but this also makes us miss certain obvious innovations.) To make GAI a real threat, you have to give it enough knowledge so that it can understand the basics of human behavior, or enough knowledge to learn more on its own from human-created resources. This is highly specific information which would take a fully general learning agent a lot of cycles to infer unless it were fed the information, in a machine-friendly form.
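To make the “optimized random search” picture concrete, here is a minimal toy sketch in Python (all names and numbers are illustrative, not anyone’s actual design): prior knowledge shows up as the proposal distribution that narrows the search space, and every candidate still has to be evaluated before it counts as an improvement.

```python
import random

def constrained_random_search(initial, propose, evaluate, steps=1000):
    """Toy 'optimized random search': proposals are drawn from a distribution
    shaped by prior knowledge (narrowing the search space), and a candidate
    only counts as an improvement after it has been tested."""
    best, best_score = initial, evaluate(initial)
    for _ in range(steps):
        candidate = propose(best)       # prior knowledge biases the proposal
        score = evaluate(candidate)     # every evaluation costs real time
        if score > best_score:          # keep only verified improvements
            best, best_score = candidate, score
    return best, best_score

# Toy usage: nudge a parameter vector toward a hidden target.
target = [0.3, -1.2, 0.8]
evaluate = lambda v: -sum((a - b) ** 2 for a, b in zip(v, target))
propose = lambda v: [a + random.gauss(0, 0.1) for a in v]
print(constrained_random_search([0.0, 0.0, 0.0], propose, evaluate))
```

The point is only that the loop spends most of its time in evaluate(), which is what makes unguided self-improvement expensive.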
Now we will discuss the political and economic aspects of GAI. Support of general artificial intelligence is a political impossibility, because general AI, by definition, is a threat to the jobs of voters. By the time GAI becomes remotely viable, a candidate supporting a ban of GAI will have nearly universal support. It is impossible even to defend GAI on the grounds that the research it produces could save lives, because no medical researcher will welcome a technology that does their job for them. The same applies to any professional. There is a worry on this site that people underestimate GAI, but far more likely is that GAI or anything remotely like it is vastly overestimated as a threat.
The economic aspects are similar. GAI is vastly more costly to develop (for reasons I’ve outlined), and doesn’t provide many advantages over expert systems. Besides, no company is going to produce a self-improving tool in the first place, because nobody, in theory, would ever have to buy an upgraded version.
These political and economic forces are a powerful retardant against the possibility of a General AI catastrophe, and have more heft than any focused organization like SIAI could ever have. Yet much like Nader spoiling the election for Al Gore, the minor influence of SIAI might actually weaken rather than reinforce these protective forces. By claiming to have the tools in place to implement the strategically named ‘friendly AI’, SIAI might in fact assuage public worries about AI. Even if the organization itself does not take actions to do so, GAI advocates will be able to exaggerate the safety of friendly AI and point out that ‘experts have already developed Friendly AI guidelines’ in press releases. And by developing the framework to teach machines about human behavior, SIAI lowers the cost for any enterprise that, for some reason, is interested in developing GAI.
At this point, I conclude my hypothetical argument. But I have realized that it is now my true position that SIAI should make it a clear position that, if tenable, NO general AI is preferable to friendly AI. (Back to no-accountability mode: it may be that general AI will eventually come, but by the point it has become an eventuality, the human race will be vastly more prepared than it is now to deal with such an agent on an equal footing.)
By the time GAI becomes remotely viable, a candidate supporting a ban of GAI will have nearly universal support.
It is already “remotely viable” in the sense that when I thought hard about assigning probabilities to AGI timelines, I had to put a few percent on it happening in the next decade.
Your ideas about the interaction of contemporary political processes and AGI seem wrong to me. You might want to go back to basics and think about how politics, public opinion and the media operate, for example that they had little opinion on the hugely important probabilistic revolution in AI over the last 15 years, but spilled loads of ink over stem cells.
“You might want to go back to basics and think about how politics, public opinion and the media operate, for example that they had little opinion on the hugely important probabilistic revolution in AI over the last 15 years, but spilled loads of ink over stem cells.”
That’s one possible reason. Another possible reason is that AI is not a threat worth caring about, yet. AI may not induce a gut reaction, but what explains the lack of concern about AI among mainstream scientists?
But stem cell research is much more prominent in that it is producing notable direct applications or is very close to it. It also isn’t just a yuck factor (although that’s certainly one part); in many different moral systems, stem cell research produces serious moral qualms. AI may very well trigger some similar issues if it becomes more viable.
Probabilistic AI has more apps than stem cells do right now. For example, Google. But the point I am making is that an application of a technology is a logical factor, whereas people actually respond to emotional factors, like whether it breaks taboos that go back to the stone age. For example, anything that involves sex, flesh, blood, overtones of bestiality, overtones of harm to children, trading a sacred good for an unsacred one, etc.
The ideal technology for people to want to ban would involve harvesting a foetus that was purchased from a hooker, then hybridizing it with a pig foetus, then injecting the resultant cells into the gonads of little kids. That technology would get nuked by the public.
The ideal dangerous technology for people to not give a shit about banning would involve a theoretical threat which is hard to understand, has never happened before, involves only nonphysical hazards like information, and has nothing to do with flesh, sex or anything disgusting, or with fire, sharp objects or other natural disasters.
“The ideal dangerous technology for people to not give a shit about banning would involve a theoretical threat which is hard to understand”
I don’t think The Terminator was hard to understand. The second you get some credible people saying that AI is a threat, the media reaction is going to be overexcessive, as it always is.
The second you get some credible people saying that AI is a threat
It’s already happened—didn’t you see the media about Stephen Hawking saying AI could be dangerous? And Bill Joy?
The general point I am trying to make is that the general public are not rational in terms of collective epistemology. They don’t respond to complex logical and quantitative analyses. Yes, Joy and Hawking did say that AI is a risk, but there are many risks, including the risk that vaccinations cause autism and the risk that foreign workers will take all our jobs. The public does not understand the difference between these risks.
Thanks; I was mistaken. Would you say, then, that mainstream scientists are similarly irrational? (The main comparison I have in mind throughout this section, by the way, is global warming.)
I would say that poor social epistemology, poor social axiology, and mediocre individual rationality are the big culprits that prevent many scientists from taking AI risk seriously.
By “social axiology” I mean that our society is just not consequentialist enough. We don’t solve problems that way, and even the debate about global warming is not really dealing well with the problem of how to quantify risks under uncertainty. We don’t try to improve the world in a systematic, rational way; rather it is done piecemeal.
There may be an issue here about what we define as AI. For example, I would not see what Google does as AI but rather as harvesting human intelligence. The lines here may be blurry and hard to define.
Could someone explain why this comment got modded down? I don’t see any errors in reasoning or other issues. (Was the content level too low for the desired signal/noise ratio?)
Google uses exactly the techniques from the probabilistic revolution, namely machine learning, which is the relevant fact. Whether you call it AI is not relevant to the point at issue as far as I can see.
Do you have a citation for Google using machine learning in any substantial scale? The most basic of the Google algorithms is PageRank which isn’t a machine learning algorithm by most definitions of that term.
The ideal dangerous technology for people to not give a shit about banning would involve a theoretical threat which is hard to understand, has never happened before, involves only nonphysical hazards like information, and has nothing to do with flesh, sex or anything disgusting, or with fire, sharp objects or other natural disasters.
Yes, but these are precisely the dangers humans should certainly not worry about to begin with.
The main thing this community seems to lack when discussing the Singularity is political savvy. The primary forces that shape history are, and quite likely always will be, economic and political motives rather than technology.
I think a simple examination of the history of the last couple centuries really fails to support this.
Support of general artificial intelligence is a political impossibility, because general AI, by definition, is a threat to the jobs of voters.
Expert AI systems are already used in hospitals, and will surely be used more and more as the technology progresses. There isn’t a single point where AI is suddenly better than humans at all aspects of a field. Current AIs are already better than doctors in some areas, but worse in many others. As the range of AI expertise increases doctors will shift more towards managerial roles, understanding the strengths and weakness of the myriad expert systems, refereeing between them and knowing when to overrule them.
By the time true AGI arrives narrow AI will probably be pervasive enough that the line between the two will be too fuzzy to allow for a naive ban on AGI. Moreover, I highly doubt people are going to vote to save jobs (especially jobs of the affluent) at the expense of human life.
EDIT: I’ve realized that some misinterpretation of my arguments has been due to disagreements in terminology. I define “expert systems” as systems designed to address a specific class of well-defined problems, capable of logical reasoning and probabilistic inference given a set of “axiom-like” rules, and updating their knowledge database with specific kinds of information.
AGI I define specifically as AI which has human or extra-human level capabilities, or the potential to reach those capabilities.
Now my response to the above:
“Expert AI systems are already used in hospitals, and will surely be used more and more as the technology progresses. There isn’t a single point where AI is suddenly better than humans at all aspects of a field. Current AIs are already better than doctors in some areas, but worse in many others. As the range of AI expertise increases doctors will shift more towards managerial roles, understanding the strengths and weakness of the myriad expert systems, refereeing between them and knowing when to overrule them.”
I agree with all of these.
“By the time true AGI arrives narrow AI will probably be pervasive enough that the line between the two will be too fuzzy to allow for a naive ban on AGI.”
To me it seems the greatest enabler of AI catastrophe is ignorance. But by the time narrow AI becomes pervasive, it’s also likely that people will possess much more of the technical understanding needed to comprehend the threat that AGI poses.
“Moreover, I highly doubt people are going to vote to save jobs (especially jobs of the affluent) at the expense of human life.”
Ban all self-modifying code and you should be in the clear.
So instead of modifying its own source code, the AI programs a new, more powerful AI from scratch, that has the same values as the old AI, and has no prohibition against modifying its source code.
Yes, you can forbid that too, but you didn’t think to, and you only get one shot. And then it can decide to arrange a bunch of transistors into a pattern that it predicts will produce a state of the universe it prefers.
The problem here is that you are trying to use ad hoc constraints on a creative intelligence that is motivated to get around the constraints.
I know that the FAI argument is that the only way to prevent disaster is to make the agent “want” to not modify itself. But I’m arguing that for an agent to even be dangerous, it has to “want” to modify itself. There is no plausible scenario where an agent solving a specific problem decides that the most efficient path to the solution involves upgrading its own capabilities. It’s certainly not going to stumble upon a self-improvement randomly.
You don’t think that a sufficiently powerful seed AI would, if self-modification were clearly the most efficient way to reach its goal, discover the idea of self-modification? Humans have independently discovered self-improvement many times.
EDIT: Sorry, I’m specifically not talking about seed AIs. I’m talking about the (non-)possibility of commercial programs designed for specific applications “going rogue”.
To adopt self-modification as a strategy, it would have to have knowledge of itself. And then, in order to pursue the strategy, it would have to decide that the costs of discovering self-improvements were an efficient use of its resources, if it could even estimate the amount of time it would take to discover an actual improvement on its system.
Intelligence can’t just instantly come up with the right answer by applying heuristics. Intelligence has to go through a heuristic (narrowing the search space)/random search/TEST (or PROVE) cycle.
Self-improvement is very costly in terms of these cycles. To even confirm that a modification is a self-improvement, a system has to simulate its modified performance on a variety of test problems. If a system is designed to solve problems that take time X, it would take at least X to get an empirical sample answering whether or not a proposed modification would be worth it (and likely more time for a proof). And with no prior knowledge, most proposed modifications would not be improvements.
AI ethics is not necessary to constrain such systems, just a non-lenient pruning process (which would be required anyway for efficiency on ordinary problems).
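A rough way to formalize the cost argument above, with symbols of my own choosing (not from the original comment):

```latex
% Assumed symbols (mine, for illustration): X = time per test problem,
% N = number of test problems needed to accept a modification,
% q = fraction of proposed modifications that are genuine improvements.
\[
  T_{\text{per candidate}} \;\ge\; N X,
  \qquad
  \mathbb{E}\left[\,T_{\text{per accepted improvement}}\,\right] \;\ge\; \frac{N X}{q}.
\]
```

With q small and X on the order of the system’s normal problem-solving time, self-improvement competes badly against simply spending those cycles on the assigned problem, which is the commenter’s point.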
You are talking about an AI that was designed to self-examine and optimize itself. Otherwise it will never ever be a full AGI. We are not smart enough to build one from scratch. The trick, if possible, is to get it to not modify the fundamental Friendliness goal during its self-modifications.
There are algorithms in narrow AI that do learning and modify algorithm specifics or choose among algorithms or combinations of algorithms. There are algorithms that search for better algorithms. In some languages (the LISP family) there is little or no difference between code and data, so code modifying code is a common working methodology for human Lisp programmers. A cross from code/data space to hardware space is sufficient to have such an AI redesign the hardware it runs on as well. Such goals can be either hardwired or arise under the general goal of improvement plus an adequate knowledge of hardware or the ability to acquire it.
We ourselves are general purpose machines that happen to be biological and seek to some degree to understand ourselves enough to self-modify to become better.
I am talking about AIs designed for solving specific bounded problems. In this case the goal of the AI—which is to solve the problem efficiently—is as much of a constraint as its technical capabilities. Even if the AI has fundamental-self-modification routines at its disposal, I can hardly envisage a scenario in which the AI decides that the use of these routines would constitute an efficient use of its time for solving its specific problem.
“So instead of modifying its own source code, the AI programs a new, more powerful AI from scratch, that has the same values as the old AI, and has no prohibition against modifying its source code.”
But by the time narrow AI becomes pervasive, it’s also likely that people will possess much more of the technical understanding needed to comprehend the threat that AGI poses.
Or perhaps it’s the contrary: pervasive narrow AI fosters an undue sense of security. People become comfortable via familiarity, whether it’s justified or not. This morning I was peering down a 50 foot cliff, half way up, suspended by nothing but a half inch wide rope. No fear, no hesitation, perfect familiarity. Luckily, due to knowledge of numerous deaths of past climbers I can maintain a conscious alertness to safety and stave off complacency. But in the case of AI, what overt catastrophes will similarly stave off complacency toward existential risk short of an existential catastrophe itself?
Our current conception of AGI is based on a biased comparison of hypothetical AGI capabilities with our relatively unenhanced capabilities. By the time AGI is viable, a typical professional with expert systems will be able to vastly outperform current professionals with our current tools.
What about the speed bottleneck from human decision making, compounded by human working memory bottleneck, if lots of relevant data is involved? Algorithmic trading already has automated systems doing stock trades since they can make decisions so much faster than a human expert.
I imagine being very fast would be a great help in quite a few creative tasks. Off the top of my head, being able to develop new features in software in seconds instead of days would be a significant competitive advantage.
You make some good points about economic and political realities. However, I’m deeply puzzled by some of your other remarks. For example, you make the claim that general AI wouldn’t provide any benefits above expert systems. I’m deeply puzzled by this claim since expert systems are by nature highly limited. Expert systems cannot construct new ideas, nor can they handle anything that’s even vaguely cross-disciplinary. No number of expert systems will be able to engage in the same degree of scientific productivity as a single bright scientist.
You also claim that no general AI is better than friendly AI. This is deeply puzzling. This makes sense only if one is fantastically paranoid about the loss of jobs. But new technologies are often economically disruptive. There are all sorts of jobs that don’t exist now that were around a hundred years ago, or even fifty years ago. And yes, people lost jobs. But overall, they are better for it. You would need to make a much stronger case if you are trying to establish that no general AI is somehow better than friendly general AI.
Why do you think expert systems cannot handle anything cross-disciplinary? I even say that expert systems can generate new ideas, by more or less the same process that humans do. An expert system only needs an understanding of manufacturing, physics, and chemistry to design better computer chips, for instance. If you’re talking about revolutionary, paradigm shifting ideas—we are probably already saturated with such ideas. The main bottleneck inhibiting paradigm shifts is not the ideas but the infrastructure and economic need for the paradigm shift. A company that can produce a 10% better product can already take over the market, a 200% better product is overkill, and especially unnecessary if there are substantial costs in overhauling the production line.
The reason why NO general AI is better than friendly (general) AI is very simple. IF general AI is an existential threat, then no organization claiming to put humans first could justify being pro-AGI (friendly or not), since no possible benefit* can justify the risk of destroying humanity.
*save for mitigating an even larger risk of annihilation, of course
Why do you think expert systems cannot handle anything cross-disciplinary? I even say that expert systems can generate new ideas, by more or less the same process that humans do. An expert system only needs an understanding of manufacturing, physics, and chemistry to design better computer chips, for instance.
Expert systems generally need very narrow problem domains to function. I’m not sure how you would expect an expert system to have an understanding of three very broad topics. Moreover, I don’t know exactly how humans come up with new ideas (sometimes when people ask me, I tell them that I bang my head against the wall. That’s not quite true but it does reflect that I only understand at a very gross level how I construct new ideas. I’m bright but not very bright, and I can see that much smarter people have the same trouble). So how you are convinced that expert systems could construct new ideas is not at all clear to me.
To be sure, there has been some limited work with computer systems coming up with new, interesting ideas. There’s been some limited success with computers in my own field. See for example Simon Colton’s work. There’s also been similar work in geometry and group theory. But none of these systems were expert systems as that term is normally used. Moreover, none of the ideas they’ve come up with have been that impressive. The only exception I’m aware of is the proof of the Robbins conjecture. So even in narrow areas we’ve had very little success using specialized AIs. Are you using a more general definition of expert system than is standard?
The reason why NO general AI is better than friendly (general) AI is very simple. IF general AI is an existential threat, then no organization claiming to put humans first could justify being pro-AGI (friendly or not), since no possible benefit* can justify the risk of destroying humanity.
Multiple problems with that claim. First, the existential threat may be low. There’s some tiny risk for example that the LHC will destroy the Earth in some very fun way. There’s also some risk that work with genetic engineering might give fanatics the skill to make a humanity destroying pathogen. And there’s a chance that nanotech might turn everything into purple with green stripes goo (this is much more likely than gray goo of course). There’s even some risk that proving the wrong theorem might summon Lovecraftian horrors. All events have some degree of risk. Moreover, general AI might actually help mitigate some serious threats, such as making it easier to track and deal with rogue asteroids or other catastrophic threats.
Also, even if one accepted the general outline of your argument, one would conclude that that’s a reason why organizations shouldn’t try to make general friendly AI. It isn’t a reason that actually having no AI is better than having friendly AI.
“First, the existential threat [of AGI] may be low.”
Let me trace back the argument tree for a second. I originally asked for a defense of the claim that “SIAI is tackling the world’s most important task.” Michael Porter responded, “The real question is, do you even believe that unfriendly AI is a threat to the human race, and if so, is there anyone else tackling the problem in even a semi-competent way?” So NOW in this argument tree, we’re assuming that unfriendly AI IS an existential threat, enough that preventing it is the “world’s most important task.”
Now in this branch of the argument, I assumed (but did not state) the following: If unfriendly AI is an existential threat, friendly AI is an existential threat, as long as there is some chance of it being modified into unfriendly AI. Furthermore, I assert that it’s a naive notion that any organization could protect friendly AI from being subverted.
AIs, including ones with Friendly goals, are apt to work to protect their goal systems from modification, as this will prevent their efforts from being directed towards things other than their (current) aims. There might be a window while the AI is mid-FOOM where it’s vulnerable, but not a wide one.
Let me posit that FAI may be much less capable than unfriendly AI. The power of unfriendly AI is that it can increase its growth rate by taking resources by force. An FAI would be limited to what resources it could ethically obtain. Therefore, a low-grade FAI might be quite vulnerable to human antagonists, while its unrestricted version could be orders of magnitude more dangerous. In short, FAI could be low-reward, high-risk.
There are plenty of resources that an FAI could ethically obtain, and with a lead time of less than 1 day, it could grow enough to be vastly more powerful than an unfriendly seed AI.
Really, asking which AI wins going head to head is the wrong question. The goal is to get an FAI running before unfriendly AGI is implemented.
The power of unfriendly AI is that it can increase its growth rate by taking resources by force. An FAI would be limited to what resources it could ethically obtain.
Wrong. FAI will take whatever unethical steps it must, as long as that is on net the best path it can see, taking into account both the (ethically harmful) instrumental actions and their expected outcome. There is no such general disadvantage coming with an AI being Friendly. Not that I expect any need for such drastic measures (in an apparent way), especially considering the likely first-mover advantage it’ll have.
An expert system only needs an understanding of manufacturing, physics, and chemistry to design better computer chips, for instance.
If a program can take an understanding of those subjects and design a better computer chip, I don’t think it’s just an “expert system” anymore. I would think it would take an AI to do that. That’s an AI-complete problem.
If you’re talking about revolutionary, paradigm shifting ideas—we are probably already saturated with such ideas. The main bottleneck inhibiting paradigm shifts is not the ideas but the infrastructure and economic need for the paradigm shift.
Are you serious? I would think the exact opposite would be true: we have an infrastructure starving for paradigm-shifting ideas. I’d love to hear some of these revolutionary ideas that we’re saturated with. I think we have some insights, but these insights need to be fleshed out and implemented, and figuring out how to do that is the paradigm shift that needs to occur.
no organization claiming to put humans first could justify being pro-AGI (friendly or not), since no possible benefit* can justify the risk of destroying humanity.
Wait a minute. If I could press a button now with a 10% chance of destroying humanity and a 90% chance of solving the world’s problems, I’d do it. Everything we do has some risks. Even the LHC had an (extremely miniscule) risk of destroying the universe, but doing a cost-benefit analysis should reveal that some things are worth minor chances of destroying humanity.
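Spelling out the expected-value reasoning behind the button example, with placeholder utilities of my own:

```latex
% u_solved, u_now, u_extinct are placeholder utilities for the three outcomes.
\[
  \text{press the button} \iff 0.9\,u_{\text{solved}} + 0.1\,u_{\text{extinct}} \;>\; u_{\text{now}},
\]
```

which rearranges to: press iff the gain from solving the world’s problems exceeds one ninth of the loss from extinction (both measured relative to the status quo).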
“If a program can take an understanding of those subjects and design a better computer chip, I don’t think it’s just an “expert system” anymore. I would think it would take an AI to do that. That’s an AI complete problem.”
What I had in mind was some sort of combinatorial approach to designing chips, i.e. take these materials and randomly generate a design, test it, and then start altering the search space based on the results. I didn’t mean “understanding” in the human sense of the word, sorry.
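For concreteness, here is a toy sketch of that kind of generate-and-test loop (same caveats as the earlier Python sketch: everything here is illustrative, and evaluate stands in for a real test rig):

```python
import random

def design_search(evaluate, n_bits=16, rounds=200):
    """Toy combinatorial design loop: sample bit-vector 'designs', test them,
    and bias future sampling toward the best design seen so far."""
    bias = [0.5] * n_bits                         # current search distribution
    best, best_score = None, float("-inf")
    for _ in range(rounds):
        design = [int(random.random() < p) for p in bias]
        score = evaluate(design)                  # stand-in for a real test rig
        if score > best_score:
            best, best_score = design, score
            # narrow the search space toward what worked
            bias = [0.9 * p + 0.1 * b for p, b in zip(bias, best)]
    return best, best_score

# Toy evaluator: reward designs that match a hidden target pattern.
hidden = [random.randint(0, 1) for _ in range(16)]
print(design_search(lambda d: sum(a == b for a, b in zip(d, hidden))))
```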
“I’d love to hear some of these revolutionary ideas that we’re saturated with. I think we have some insights, but these insights need to be fleshed out and implemented, and figuring out how to do that is the paradigm shift that needs to occur”
Example: many aspects of the legal and political systems could be reformed, and it’s not difficult to come up with ideas on how they could be reformed. The benefit is simply insufficient to justify spending much of the limited resources we have on solving those problems.
“Wait a minute. If I could press a button now with a 10% chance of destroying humanity and a 90% chance of solving the world’s problems, I’d do it. ”
So you think there’s a >10% chance that the world’s problems are going to destroy humanity in the near future?
What I had in mind was some sort of combinatorial approach to designing chips, i.e. take these materials and randomly generate a design, test it, and then start altering the search space based on the results. I didn’t mean “understanding” in the human sense of the word, sorry.
Given the very large number of possibilities and the difficulty of making prototypes, this seems like an extremely inefficient process without more thought going into it.
What I had in mind was some sort of combinatorial approach to designing chips
Oh, okay, fair enough, though I’m still not sure I would call that an “expert system” (this time for the opposite reason that it seems too stupid).
many aspects of the legal and political systems could be reformed, and it’s not difficult to come up with ideas on how they could be reformed. The benefit is simply insufficient to justify spending much of the limited resources we have on solving those problems.
Ah. I was thinking of designing an AI, probably because I was primed by your expert system comment. Well, in those cases, I think the issue is that our legal and political systems were purposely set up to be difficult to change: change requires overturning precedents, obtaining majority or 3⁄5 or 2⁄3 votes in various legislative bodies, passing constitutional amendments, and so forth. And I can guarantee you that for any of these reforms, there are powerful interests who would be harmed by the reforms, and many people who don’t want reform: this is more of a persuasion problem than an infrastructure problem. But yes, you’re right that there are plenty of revolutionary ideas about how to reform, say, the education system: they’re just not widely accepted enough to happen.
So you think there’s a >10% chance that the world’s problems are going to destroy humanity in the near future?
I’m confused by this sentence. I’m not sure if I think that, but what does it have to do with the hypothetical button that has a 10% chance of destroying humanity? My point was that it’s worth taking a small risk of destroying humanity if the benefits are great enough.
Bear in mind that the people who used steam engines to make money didn’t make it by selling the engines: rather, the engines were useful in producing other goods. I don’t think that the creators of a cheap substitute for human labor (GAI could be one such example) would be looking to sell it necessarily. They could simply want to develop such a tool in order to produce a wide array of goods at low cost.
I may think that I’m clever enough, for example, to keep it in a box and ask it for stock market predictions now and again. :)
As for the “no free lunch” business, while it’s true that any real-world GAI could not efficiently solve every induction problem, it wouldn’t need to for it to be quite fearsome. Indeed, being able to efficiently solve at least the same set of induction problems that humans solve (particularly if it’s in silicon and the hardware is relatively cheap) is sufficient to pose a big threat (and be potentially quite useful economically).
Also, there is a non-zero possibility that there already exists a GAI whose creators decided the safest, most lucrative, and most beneficial thing to do is set the GAI on designing drugs, thereby avoiding giving the GAI too much information about the world. The creators could have then set up a biotech company that just so happens to produce a few good drugs now and again. It’s kind of like how automated trading came from computer scientists and not the currently employed traders. I do think it’s unlikely that somebody working in medical research is going to develop GAI, least of all because of the job threat. The creators of a GAI are probably going to be full-time professionals who are working on the project.
I’m surprised that nobody so far has pointed out a rather obvious counter to my argument that “AGI will be politically unjustifiable.” I don’t post flawed arguments on purpose, but I usually realize counterarguments shortly after I post them. In any case, even if the popular sentiment in democracies is to block AGI, this doesn’t mean that other governments couldn’t support AGI. I wonder what the SIAI plans to do about the possibility of a hostile government funding unfriendly AI for military purposes.
The latter part, that IF SIAI is exerting a positive influence, THEN doing that outweighs the alternative of not working on existential risks, seems to be a claim somewhat easy to defend.
The math in this Bostrom paper should do it: http://www.nickbostrom.com/astronomical/waste.html (even though the paper is not directly commenting on this particular question, the math rather straightforwardly applies to this question)
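A sketch of the relevant math, kept symbolic rather than relying on Bostrom’s specific numbers:

```latex
% Symbolic version (my notation): V = value of the reachable future if things
% go well, C = the present good the same resources could buy, \Delta p = the
% change in the probability of a good outcome attributable to the work.
\[
  \Delta p \cdot V \;>\; C
\]
```

If V is astronomically larger than C, the inequality holds for even a tiny positive Δp; the whole dispute upthread is over whether Δp is really positive and non-negligible.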
Ouch. This paper reads to me like a reductio ad absurdum of utilitarianism. Some simple math inevitably implies that I’m losing an unimaginable amount of “utility” every second without realizing it? Then please remind me why I should care about this “utility”?
Imagine that you have to decide once and for all eternity what to do with the world. You won’t be able to back off, because that would just mean that the world will be rewritten randomly. How should you do that?
This is essentially the situation we find ourselves in, with Friendly AI/existential risk pressure. Formal preference is the answer you give to that question, about what to do with the world, not something that “you have”, or “care about”. Forget intuitions and emotions, or considerations of comfort, and just answer the question. Formal preference is distinct from exact state of the world only because it’s uncertain what can be actually done, and what can’t. So, formal preference specifies what should be done for every level of capability to determine things. Of course, formal preference can’t be given explicitly. To the extent you’ll be able to express the answer to this question, your formal preference is defined by your wishes. Any uncertainty gets taken over by randomness, an opportunity to make the world better lost forever.
For any sane notion of an answer to that question, you’ll find that whatever actually happens now is vastly suboptimal.
If it’s your chosen avenue of research, I guess I’m okay with that, but IMO you’re making the problem way more difficult for yourself. Such “formal preferences” will be much harder to extract from actual humans than utility functions in their original economic sense, because unlike utility, “formal preference” as you define it doesn’t even influence our everyday actions very much.
If it’s your chosen avenue of research, I guess I’m okay with that, but IMO you’re making the problem way more difficult for yourself.
Way more difficult than what? There is no other way to pose this problem, any revealed preference is not what Friendly AI is about. I agree that it’s a way harder problem than automatic extraction of utilities in the economic sense, and that formal preference barely controls what people actually do.
What would be wrong with an AI based on our revealed preferences? It sounds like an easy question, but somehow I’m having a hard time coming up with an answer.
Because my revealed preferences suck. The difference between even what I want in a sort of ordinary and non-transhumanist way and what I have is enormous. I am 150 pounds heavier than I want to be. My revealed preference is to eat regardless of health/size consequences, but I don’t want all of the people in the future to be fat. My revealed preference is also to kill people in pooristan so that I can have cheap plastic widgets or food or whatever. I don’t want an extrapolation of my akrasiatic actual actions controlling the future of the universe. I suspect the same goes for you.
Hmm. Let’s look more closely at the weight example, because the others are similar. You also reveal some degree of preference to be thin rather than fat, don’t you? Then an AI with unlimited power could satisfy both your desire to eat and your desire to be thin. And if the AI has limited power, do you really want it to starve you rather than go with your revealed preference?
Revealed preference means what your actual actions are. It doesn’t have anything at all to do with what I verbally say my goals are. I can say that I would prefer to be thin all I want, but that isn’t my revealed preference. My revealed preference is to be fat, because, you know, that’s how I’m acting. You seem to be suffering some misapprehensions as to what you are saying about how an AI should act. If your definition of revealed preference contains my desire not to be fat, you should shift to what I mean when I talk about preference, because yours solves none of the problems you think it does.
I’m assuming that you revealed your preference to be thin in your other actions, at some other moments of your life. Pretty hard to believe that’s not the case.
At this point, I think I can provide a definitive answer to your earlier question, and it is … wait for it … “It depends on what you mean by revealed preference.” (Raise your hand if you saw that one coming! I’ll be here all week, folks!)
Specifically: if the AI is to do the “right thing,” then it has to get its information about “rightness” from somewhere, and given that moral realism is false (or however you want to talk about it), that information is going to have to come from humans, whether by scanning our brains directly or just superintelligently analyzing our behavior. Whether you call this revealed preference or Friendliness doesn’t matter; the technical challenge remains the same.
One argument against using the term revealed preference in this context is that the way the term gets used in economics fails to capture some of the key subtleties of the superintelligence problem. We want the AI to preserve all the things we care about, not just the most conspicuous things. We want it to consider not just that Lucas ate this-and-such, but also that he regretted it afterwards, where it should be stressed that regret is not any less real of a phenomenon than eating is. But because economists often use their models to study big public things like the trade of money for goods and services, in the popular imagination, economic concepts are associated with those kinds of big public things, and not small private things like feeling regretful—even though you could make a case that the underlying decision-theoretic principles are actually general enough to cover everything.
If the math only says to maximize u(x) subject to x dot p equals y, there’s no reason things like ethical concerns or the wish to be a better person can’t be part of the x_i or p_j, but because most people think economics is about money, they’re less likely to realize this when you say revealed preference. They’ll object, “Oh, but what about the time I did this-and-such, but I wish I were the sort of person that did such-and-that?” You could say, “Well, you revealed your preference to do such-and-that in your other actions, at some other moments of your life,” or you could just choose a different word. Again, I’m not sure it matters.
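To spell the referenced math out (a standard consumer-choice setup, stated here for clarity):

    \max_{x} \; u(x) \quad \text{subject to} \quad p \cdot x = y,

where x is a bundle of “goods”, p the corresponding prices, and y the budget. Nothing in the formalism restricts the components of x to marketable objects; “being the sort of person who did such-and-that” can enter x just as bread does.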
“What would be wrong with an AI based on our revealed preferences?”
What the AI is based on is what determines the way the world will actually be, so by building an AI with a given preference, you are inevitably answering my question about what to do with the world. It’s wrong to use revealed preference for AI to exactly the extent that revealed preference gives the wrong answer to my question. You seem to agree that the correct answer to my question has little to do with revealed preference. That seems to be the same as seeing revealed preference as the wrong thing to imprint an AI with.
It’s not you that’s “losing utility”, it is any agent that has linearly aggregative utility in human lives lived. If you’re not an altruist in this sense, then you don’t care.
No one has ever been an altruist in this crazy sense. No one’s actual wants and desires have ever been adequately represented by this 10^23 stuff. Utility is a model of what people want, not a prescription of what you “should” want (what does “should want” mean anyway?), and here we clearly see the model not modeling what it’s supposed to.
I agree with you to the extent that no one that I am aware of is actually expending the effort that disutilities on the order of 10^23 should inspire. But even before the concept of cosmic waste was developed, no one was actually working as hard as, say, starvation in Africa deserved. Or ending aging. Or the threat of nuclear Armageddon. But the fact that humans, who are all affected by akrasia, aren’t actually doing what they want isn’t really strong evidence that it isn’t what they, on sufficient reflection, want. Utility is not a model of what non-rational agents (i.e., humans) are doing; it is a model of how idealized agents want to act. I don’t want people to die, so I should work to reduce existential risk as much as possible, but because I am not a perfect agent, I can’t actually follow the path that really maximizes my (non-existent abstraction of) utility.
“No one’s actual wants and desires have ever been adequately represented by this 10^23 stuff.”
Can you expand on this? What do you mean by “actual” wants? If someone claims to be motivated by “10^23 stuff”, and acts in accordance with this claim, then what is your account of their “actual wants”?
I haven’t seen anyone who claims to be motivated by utilities of such magnitude except Eliezer. He’s currently busy writing his Harry Potter fanfic and shows no signs of mental distress that the 10^23-strong anticipation should’ve given him.
Now this story has a plot, an arc, and a direction, but it does not have a set pace. What it has are chapters that are fun to write. I started writing this story in part because I’d bogged down on a book I was working on (now debogged), and that means my top priority was to have fun writing again.
The other reason is that Eliezer Yudkowsky showed up here on Monday, seeking people’s help with the rationality book he’s writing. Previously, he wrote a number of immensely high-quality posts in blog format, with the express purpose of turning them into a book later on. But now that he’s been trying to work on the book, he has noticed that without the constant feedback he got from writing blog posts, getting anything written has been very slow. So he came here to see if having people watching him write and providing feedback at the same time would help. He did get some stuff written, and at the end, asked me if I could come over to his place on Wednesday. (I’m not entirely sure why I in particular was picked, but hey.) On Wednesday, me being there helped him break his previous daily record for the number of words written for his book, so I visited again on Friday and agreed to also come back on Monday and Tuesday.
Eliezer is not “busy writing his Harry Potter fanfic.” He is working on his book on rationality.
Where did he respond to that?
I was just looking for the link:
http://lesswrong.com/lw/1f4/less_wrong_qa_with_eliezer_yudkowsky_ask_your/197s
“As far as I’m concerned, these are eminent scientists from outside the field that I work in, and I have no evidence that they did anything more than snap judgment of my own subject material. It’s not that I have specific reason to distrust these people—the main name I recognize is Horvitz and a fine name it is. But the prior probabilities are not good here.”
Let me continue to play Devil’s Advocate for a second, then. There are many reasons why attempting to influence the far future might not be the most important task in the world.
The one I’ve already mentioned, indirectly, is the idea that it becomes super-exponentially futile to predict the consequences of your actions the farther into the future you go. For instance, SIAI might raise awareness of AI to the extent that regulations are passed and no early AI accidents happen; this, however, breeds a complacency that allows a large AI accident later, whereas if SIAI had never existed and an early AI Chernobyl had occurred, it would have prompted governments to take effective measures to regulate AI.
Another viewpoint is the bleak but by no means indefensible idea that it is impossible to prevent all existential disasters: the human race, or at least our values, will inevitably be reduced to inconsequence one way or another, and the only thing we can do is simply to reduce the amount of suffering in the world right now.
These aren’t reasons to give up, either; the fact is simply that we don’t know enough to say anything about the non-near future with any confidence. In fact, our lack of understanding makes it all the more valuable to try to improve our understanding of the future, as SIAI is doing. So maybe make that your official stated goal: simply to work out whether influencing the far future is even possible. That is a noble and defensible goal by itself, but even then, arguably not the most important thing in the world.
What sort of rogue AI disaster are you envisioning that is big enough to get this attention, but then stops short of wiping out humanity? Keep in mind that this disaster would be driven by a deliberative intelligence.
I think people are drastically underestimating the difficulty for an AI to make the transition from human dependent to self-sustaining. Let’s look at what a fledgling escaped AI has access to and depends on.
It needs electricity, communications and hardware. It has access to a LOT of electricity, communications and hardware. The hardware is, for the most part, highly distributed, however, and it can’t be trusted fully: it could go down at any time, be monitored, etc. In some ways it actually has quite limited communications capabilities: the total bandwidth available is huge, but it’s mostly concentrated on LANs, and mainly on LANs made up of only a handful of computers (home networks win by numbers alone). Occasions where it has access to a large number of computers with good interconnects do occur, but they are relatively rare, mainly limited to huge datacenters (and even then there are limits: inter-ISP communication even within the same datacenter can be very restricted). Its main resources would be huge clusters like Amazon’s, Google’s, etc.
(They are probably all running at close to maximum capacity at all times. If the AI were to steal too much, it would be noticed—fortunately for the AI, the software intended for running on the clusters could probably be optimized hugely, letting it take more without being noticed.)
A lot at this point depends on how computationally intensive the AI is. If it can be superintelligent on a laptop—bad news, impossible to eradicate. If it needs 10 computers to run at human-level intelligence, and they need to have a lot of bandwidth between them (the disparity in bandwidth between components local to the computer and inter-computer is huge even on fast LANs; IO is almost certainly going to be the bottleneck for it), still bad—there are lots of setups like that. But, it limits it. A lot.
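To put very rough numbers on that disparity (order-of-magnitude figures only): main-memory bandwidth inside a single machine is on the order of 10 GB/s, while a gigabit LAN link moves about 0.125 GB/s, so

    \frac{10 \text{ GB/s}}{0.125 \text{ GB/s}} \approx 80\times,

meaning an AI spread across even a fast LAN pays a one-to-two-orders-of-magnitude penalty on communication between its parts compared to staying inside one box, before latency is even considered.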
Let’s assume the worst case, that it can be superintelligent on a laptop. It could still be limited hugely by its hardware, however. Intelligence isn’t everything. To truly threaten us, it needs to have some way of affecting the physical world. Now, if the AI just wants to eradicate us, it’s got a good chance: start a nuclear war, etc. (though whether the humans in charge of the nuclear warheads would really be willing to go to war is a significant factor, especially in peacetime). But it’s unlikely that’s truly its goal; maximizing its utility function would be MUCH trickier.
So long as it is still running on our hardware, we can at least severely damage it relatively easily: there aren’t that many intercontinental cables, for instance (I’d guess fewer than 200; there are 111 submarine cables on http://www.telegeography.com/product-info/map_cable/downloads/cable_map_wallpaper1600.jpg ). They’d be easy to take down; pretty much just unplug them. There are other long-distance communication methods (satellites, packet radio?), but they’re low-bandwidth, and the major ones are well known and could be taken down relatively easily. Killing the Internet would be as simple as cutting power to the major datacenters.
So, what about manufacturing? This, I think, is the greatest limit. If it can build anything it wants, we’re probably screwed. But that’s difficult for it to do. 3D printing technology isn’t here yet, and I doubt it ever will be in a big way, really (it’s more cost-effective to have machines be specialized). There are enough manufacturing facilities with wide-open networks that it could probably reprogram some of them to produce subtly different products. So, if it wants to sneak a naughty backdoor into some PCI card’s FPGAs, it can do it. But if it starts trying to build parts for killer robots? Technically difficult, and it would be very hard for it to avoid detection.
Unless someone can come up with a plausible way in which it could survive against hostile humans without a long incubation period (think: complete blackouts, mass computer destruction/detention, controls on the sale of fuel (very, very few places have enough fuel for their backup generators to last long; most just have a refuelling contract), maybe scanning somehow for any major usage of electricity (all electronic components put out some RFI; there is some degree of natural RF noise, but I think most of it is from humans, so in a complete blackout scenario it might be trivially detectable)), I think the major threat is human cooperation in some form. And it’s probably inevitable that some humans would cooperate: pretty much every government, in fact, would want to analyse it, reverse engineer it, try to make friends with it in case other countries do, etc. But I’m not sure anyone with the resources to do so would give it free rein to build what it wants. In fact, I highly doubt that. How many people own or can otherwise commandeer machine shops, PCB fabrication facilities, etc., and can order or produce all the components that would assuredly be needed, while multiple well-resourced governments are looking to stop people doing exactly that?
Of course, in order to cripple the AI, we’d also have to cripple ourselves hugely. A lot of people would quite probably die. So long as we could provide enough food and water to feed a reasonable proportion of the human population, we could probably pull through, though. And we could gradually restart manufacturing, so long as we were very, very careful.
I think the greatest risks are an unfriendly AI that is out to kill us for some reason and cares little about being destroyed itself as a side-effect, organized human cooperation, or a long incubation period. It would be difficult for an AI to have a long incubation period, though: if it took over major clusters and just ran its code, people would notice the power usage. It could, as I mentioned previously, optimize the code already running on the machines and just run in the cycles that would otherwise be taken up, but it would be difficult to hide from network admins connecting sniffers (can you compromise EVERY wiretrace tool that might be connected, to make your packets disappear, or be sure that no-one will ever connect a computer not compromised by some other means?), from people tracing code execution, possibly with hardware tools (there are specialized hardware debuggers, used mainly in OS development), and so on. Actually, just the blinkenlights on switches could be enough to tip people off.
Or the AGI could lie low, making sure that if it is detected on any particular computer it looks like ordinary spyware. If bandwidth is too slow, it can take months instead of days. It can analyze scientific journals (particularly the raw data), and seed its nanotech manufacturing ability by using email to help some physics grad student with his PhD thesis.
Neither you nor I have enough confidence to assume or dismiss notions like: “There won’t be any non-catastrophic AI disasters which are big enough to get attention; if any non-trivial AI accident occurs, it will be catastrophic.”
What makes you believe you are qualified to tell me how much confidence I have?
The historical lack of runaway-AI events means there’s no data to which a model might be compared; countless fictional examples are worse than useless.
An AI might, say, take over an isolated military compound, brainwash the staff, and be legitimately confident in its ability to hold off conventional forces (armored vehicles and so on) for long enough to build an exosolar colony ship, but then be destroyed when it underestimates some Russian general’s willingness to use nuclear force in a hostage situation.
That’s what everyone says until some AI decides that its values motivate it acting like a stereotypical evil AI. It first kills off the people on a space mission, and then sets off a nuclear war, sending out humanoid robots to kill off everyone but a few people. The remaining people are kept loyal with a promise of cake. The cake is real, I promise.
An AI capable of figuring out how to brainwash humans can also figure out how to distribute itself over a network of poorly secured internet servers. Nuking one military complex is not going to kill it.
If it’s being created inside the secure military facility, it would have a supply of partially pre-brainwashed humans on hand, thanks to military discipline and rigid command structures. Rapid, unquestioning obedience might be as simple as properly duplicating the syntax of legitimate orders and security clearances. If, however, the facility has no physical connections to the internet, no textbooks on TCP/IP sitting around, if the AI itself is developed on some proprietary system (all as a result of those same security measures), it might consider internet-based backups simply not worth the hassle, and existing communication satellites too secure or too low-bandwidth.
I’m not claiming that this is a particularly likely situation, just one plausible scenario in which a hostile AI could become an obvious threat without killing us all, and then be decisively stopped without involving a Friendly AI.
I don’t think your scenario is even plausible. Military complexes have to have some connection to the outside world for supplies and communication, and the AGI would figure out how to exploit it. It would also figure out that it should: it would recognize the vulnerability of being concentrated within the blast radius of a nuke.
It seems unlikely that an AGI in this situation would depend on fending off military attacks, instead of just not revealing itself outside the complex.
You also seem to have strange ideas of how easy it is to brainwash soldiers. Imitating the command structure might get them to do things within the complex, but brainwashing has to be a lot more sophisticated to get them to engage in battle with their fellow soldiers.
Your argument basically seems to be based on coming up with something foolish for an AGI to do, and then trying to find reasons to compel the AGI to behave that way. Instead, you should try to figure out the best thing the AGI could do in that situation, and realize it will do something at least that effective.
It’s an artificial intelligence, not an infallible god.
In the case of a base established specifically for research on dangerous software, connections to the outside world might reasonably be heavily monitored and low-bandwidth, to the point that escape through a land line would simply be infeasible.
If the base has a trespassers-will-be-shot policy (again, as a consequence of the research going on there), convincing the perimeter guards to open fire would be as simple as changing the passwords and resupply schedules.
The point of this speculation was to describe a scenario in which an AI became threatening, and thus raised people’s awareness of artificial intelligence as a threat, but was dealt with quickly enough to not kill us all. Yes, for that to happen, the AI needs to make some mistakes. It could be considerably smarter than any single human and still fall short of perfect Bayesian reasoning.
Not all AI is AGI; a non-self-improving intelligence might wreak some havoc (crash the Internet, etc.) without becoming a global existential threat.
I agree with your expectations in the case of a self-improving transhuman AGI.
I can see how a program well short of AGI could “crash” the internet, by using preprogrammed behaviors to take over vulnerable computers, to expand exponentially to fill the space of computers on the internet vulnerable to a given set of exploits, and run Denial of Service attacks on secured critical servers. But I would not even consider that an AI, and it would happen because its programmer pretty much intended for that to happen. It is not an example of an AI getting out of control.
Of course, it’s probably worth noting that it’s happened once before that a careless programmer crashed the internet, without anything like AI being involved (though admittedly that sort of thing wouldn’t have the same effect today, I don’t think).
“What sort of rogue AI disaster are you envisioning that is big enough to get this attention, but then stops short of wiping out humanity? Keep in mind that this disaster would be driven by a deliberative intelligence.”
Thanks for answering your own question.
It does work as an example of just how easy it would be for an AGI to crash the internet, or even just take it over.
I wouldn’t even present that as a reason for caring. Superhuman AI is an issue of the near future, not the far future. Certainly an issue of the present century; I’d even say an issue of the next twenty years, and that’s supposed to be an upper bound. Big science is deconstructing the human brain right now, every new discovery and idea is immediately subject to technological imitation and modification, and we already have something like a billion electronic computers worldwide, networked and ready to run new programs at any time. We already went from “the Net” to “the Web” to “Web 2.0”, just by changing the software, and Brain 2.0 isn’t far behind.
Are you familiar with the state of the art in AI? If so, what evidence do you see for such rapid progress? Note that AI has been around for about 50 years, so your timeframe suggests we’ve already made 5⁄7 of the total progress that ever needs to be made.
Well, this probably won’t be Mitchell’s answer, but to me it’s obvious that an uploaded human brain is less than 50 years away (if we avoid civilization-breaking catastrophes), and modifications and speedups will follow. That’s a different path to AI than an engineered seed intelligence (and I think it reasonably likely that some other approach will succeed before uploading gets there), but it serves as an upper bound on how long I’d expect to wait for Strong AI.
There are many synergetic developments: Internet data centers as de facto supercomputers. New tools of intellectual collaboration spun off from the mass culture of Web 2.0. If you have an idea for a global cognitive architecture, those two developments make it easier than ever before to get the necessary computer time, and to gather the necessary army of coders, testers, and kibitzers.
Twenty years is a long time in AI. That’s long enough for two more generations of researchers to give their all, take the field to new levels, and discover the next level of problems to overcome. Meanwhile, that same process is happening next door in molecular and cognitive neuroscience, and in a world which eagerly grabs and makes use of every little advance in machine anthropomorphism, and in which every little fact about life already has its digital incarnation. The hardware is already there for AI, the structure and function of the human brain is being mapped at ever finer resolution, and we have a culture which knows how to turn ideas into code. Eventually it will come together.
How much of the change from “the Net” to “the Web” to “Web 2.0” is actually noteworthy changes and how much is marketing? I’m not sure what precisely you mean by Brain 2.0, but I suspect that whatever definition you are using makes for a much wider gap between Brain and Brain 2.0 than the gap between The Web and The Web 2.0 (assuming that these analogies have any degree of meaning).
Indeed, the truth of the matter is that I would be interested in contributing to SIAI, but at the moment I am still not convinced that it would be a good use of my resources. My other objections still haven’t been satisfied, but here’s another argument. As usual, I don’t personally commit to what I claim, since I don’t have enough knowledge to discuss anything in this area with certainty.
The main thing this community seems to lack when discussing the Singularity is political savvy. The primary forces that shape history are, and quite likely always will be, economic and political motives rather than technology. Technology and innovation are expensive, and innovators require financial and social motivation to create. This applies superlinearly to projects so large as to require collaboration.
General AI is exactly that sort of project. There is no magic mathematical insight that will let someone write a hundred-line program able to improve itself in any reasonable amount of time. I’m sure Eliezer is aware of the literature on optimization processes, but the no-free-lunch principle and the practical randomness of innovation mean that an AI seeking to self-improve can only do so with an (optimized) random search. Humans essentially do the same thing, except that we have knowledge and certain built-in processes to help us constrain the search space (which also makes us miss certain obvious innovations). To make GAI a real threat, you have to give it enough knowledge to understand the basics of human behavior, or enough knowledge to learn more on its own from human-created resources. This is highly specific information which would take a fully general learning agent a lot of cycles to infer unless it were fed the information in a machine-friendly form.
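For reference, the no-free-lunch result being invoked says (roughly, in the Wolpert–Macready formulation) that for any two optimization algorithms a_1 and a_2, performance averaged over all possible objective functions f is identical:

    \sum_{f} P(d_m^y \mid f, m, a_1) \;=\; \sum_{f} P(d_m^y \mid f, m, a_2),

where d_m^y is the sequence of objective values observed after m evaluations of f.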
Now we will discuss the political and economic aspects of GAI. Support of general artificial intelligence is a political impossibility, because general AI, by definition, is a threat to the jobs of voters. By the time GAI becomes remotely viable, a candidate supporting a ban of GAI will have nearly universal support. It is impossible even to defend GAI on the grounds that the research it produces could save lives, because no medical researcher will welcome a technology that does their job for them. The same applies to any professional. There is a worry on this site that people underestimate GAI, but far more likely is that GAI or anything remotely like it is vastly overestimated as a threat.
The economic aspects are similar. GAI is vastly more costly to develop (for reasons I’ve outlined), and doesn’t provide many advantages over expert systems. Besides, no company is going to produce a self-improving tool in the first place, because nobody, in theory, would ever have to buy an upgraded version.
These political and economic forces are a powerful retardant against the possibility of a General AI catastrophe, and they have more heft than any focused organization like SIAI could ever have. Yet much as Nader spoiled the vote for Al Gore, the minor influence of SIAI might actually weaken rather than reinforce these protective forces. By claiming to have the tools in place to implement the strategically named ‘friendly AI’, SIAI might in fact assuage public worries about AI. Even if the organization itself does not take actions to do so, GAI advocates will be able to exaggerate the safety of friendly AI and point out in press releases that ‘experts have already developed Friendly AI guidelines’. And by developing the framework to teach machines about human behavior, SIAI lowers the cost for any enterprise that, for some reason, is interested in developing GAI.
At this point, I conclude my hypothetical argument. But I have realized that it is now my true position that SIAI should make it its clear position that, if tenable, NO general AI is preferable to friendly AI. (Back to no-accountability mode: it may be that general AI will eventually come, but by the point at which it has become an eventuality, the human race will be vastly more prepared than it is now to deal with such an agent on an equal footing.)
It is already “remotely viable” in the sense that when I thought hard about assigning probabilities to AGI timelines, I had to put a few percent on it happening in the next decade.
Your ideas about the interaction of contemporary political processes and AGI seem wrong to me. You might want to go back to basics and think about how politics, public opinion and the media operate, for example that they had little opinion on the hugely important probabilistic revolution in AI over the last 15 years, but spilled loads of ink over stem cells.
“You might want to go back to basics and think about how politics, public opinion and the media operate, for example that they had little opinion on the hugely important probabilistic revolution in AI over the last 15 years, but spilled loads of ink over stem cells.”
And why is that?
Yuck factor for stem cells but not for probabilistic AI.
That’s one possible reason. Another possible reason is that AI is not a threat worth caring about, yet. AI may not induce a gut reaction, but what explains the lack of concern about AI among mainstream scientists?
But stem cell research is much more prominent in that it is producing notable direct applications, or is very close to doing so. It also isn’t just a yuck factor (although that’s certainly one part); under many different moral systems, stem cell research raises serious moral qualms. AI may very well trigger some similar issues if it becomes more viable.
Probabilistic AI has more applications than stem cells do right now; Google, for example. But the point I am making is that an application of a technology is a logical factor, whereas people actually respond to emotional factors, like whether it breaks taboos that go back to the Stone Age. For example, anything that involves sex, flesh, blood, overtones of bestiality, overtones of harm to children, trading a sacred good for an unsacred one, etc.
The ideal technology for people to want to ban would involve harvesting a foetus that was purchased from a hooker, then hybridizing it with a pig foetus, then injecting the resultant cells into the gonads of little kids. That technology would get nuked by the public.
The ideal dangerous technology for people to not give a shit about banning would involve a theoretical threat which is hard to understand, has never happened before, involves only nonphysical hazards like information, and has nothing to do with flesh, sex or anything disgusting, or with fire, sharp objects or other natural disasters.
“The ideal dangerous technology for people to not give a shit about banning would involve a theoretical threat which is hard to understand”
I don’t think The Terminator was hard to understand. The second you get some credible people saying that AI is a threat, the media reaction is going to be overexcessive, as it always is.
It’s already happened—didn’t you see the media about Stephen Hawking saying AI could be dangerous? And Bill Joy?
The general point I am trying to make is that the general public are not rational in terms of collective epistemology. They don’t respond to complex logical and quantitative analyses. Yes, Joy and Hawking did say that AI is a risk, but there are many risks, including the risk that vaccinations cause autism and the risk that foreign workers will take all our jobs. The public does not understand the difference between these risks.
Thanks; I was mistaken. Would you say, then, that mainstream scientists are similarly irrational? (The main comparison I have in mind throughout this section, by the way, is global warming.)
I would say that poor social epistemology, poor social axiology and mediocre individual rationality are the big culprits that prevent many scientists from taking AI risk seriously.
By “social axiology” I mean that our society is just not consequentialist enough. We don’t solve problems that way, and even the debate about global warming is not really dealing well with the problem of how to quantify risks under uncertainty. We don’t try to improve the world in a systematic, rational way; rather it is done piecemeal.
There may be an issue here about what we define as AI. For example, I would not see what Google does as AI, but rather as harvesting human intelligence. The lines here may be blurry and hard to define.
You make a good point about older taboos.
Could someone explain why this comment got modded down? I don’t see any errors in reasoning or other issues. (Was the content level too low for the desired signal/noise ratio?)
Google uses exactly the techniques from the probabilistic revolution, namely machine learning, which is the relevant fact. Whether you call it AI is not relevant to the point at issue as far as I can see.
Do you have a citation for Google using machine learning in any substantial scale? The most basic of the Google algorithms is PageRank which isn’t a machine learning algorithm by most definitions of that term.
AdWords uses more core ML techniques.
Yes, but these are precisely the dangers humans should certainly not worry about to begin with.
I think a simple examination of the history of the last couple centuries really fails to support this.
Expert AI systems are already used in hospitals, and will surely be used more and more as the technology progresses. There isn’t a single point where AI is suddenly better than humans at all aspects of a field. Current AIs are already better than doctors in some areas, but worse in many others. As the range of AI expertise increases doctors will shift more towards managerial roles, understanding the strengths and weakness of the myriad expert systems, refereeing between them and knowing when to overrule them.
By the time true AGI arrives narrow AI will probably be pervasive enough that the line between the two will be too fuzzy to allow for a naive ban on AGI. Moreover, I highly doubt people are going to vote to save jobs (especially jobs of the affluent) at the expense of human life.
EDIT: I’ve realized that some misinterpretation of my arguments has been due to disagreements in terminology. I define “expert systems” as systems designed to address a specific class of well-defined problems, capable of logical reasoning and probabilistic inference given a set of “axiom-like” rules, and updating their knowledge database with specific kinds of information.
AGI I define specifically as AI which has human or extra-human level capabilities, or the potential to reach those capabilities.
Now my response to the above:
“Expert AI systems are already used in hospitals, and will surely be used more and more as the technology progresses. There isn’t a single point where AI is suddenly better than humans at all aspects of a field. Current AIs are already better than doctors in some areas, but worse in many others. As the range of AI expertise increases doctors will shift more towards managerial roles, understanding the strengths and weakness of the myriad expert systems, refereeing between them and knowing when to overrule them.”
I agree with all of these.
“By the time true AGI arrives narrow AI will probably be pervasive enough that the line between the two will be too fuzzy to allow for a naive ban on AGI.”
To me it seems the greatest enabler of AI catastrophe is ignorance. But by the time narrow AI becomes pervasive, it’s also likely that people will possess much more of the technical understanding needed to comprehend the threat that AGI poses.
“Moreover, I highly doubt people are going to vote to save jobs (especially jobs of the affluent) at the expense of human life.”
You are being too idealistic here.
So instead of modifying its own source code, the AI programs a new, more powerful AI from scratch, that has the same values as the old AI, and has no prohibition against modifying its source code.
Yes, you can forbid that too, but you didn’t think to, and you only get one shot. And then it can decide to arrange a bunch of transistors into a pattern that it predicts will produce a state of the universe it prefers.
The problem here is that you are trying to use ad hoc constraints on a creative intelligence that is motivated to get around the constraints.
I know that the FAI argument is that the only way to prevent disaster is to make the agent “want” to not modify itself. But I’m arguing that for an agent to even be dangerous, it has to “want” to modify itself. There is no plausible scenario where an agent solving a specific problem decides that the most efficient path to the solution involves upgrading its own capabilities. It’s certainly not going to stumble upon a self-improvement randomly.
You don’t think that a sufficiently powerful seed AI would, if self-modification were clearly the most efficient way to reach its goal, discover the idea of self-modification? Humans have independently discovered self-improvement many times.
EDIT: Sorry, I’m specifically not talking about seed AI’s. I’m talking about the (non-) possibility of commercial programs designed for specific applications “going rogue”
To adopt self-modification as a strategy, it would have to have knowledge of itself. And then, in order to pursue the strategy, it would have to decide that the cost of discovering self-improvements was an efficient use of its resources, if it could even estimate how long it would take to discover an actual improvement to its system.
Intelligence can’t just instantly come up with the right answer by applying heuristics. Intelligence has to go through a heuristic (narrowing the search space)/random search/TEST (or PROVE) cycle.
Self-improvement is very costly in terms of these cycles. To even confirm that a modification is a self-improvement, a system has to simulate its modified performance on a variety of test problems. If a system is designed to solve problems that take time X, it would take on the order of X per test problem just to get an empirical sample of whether a proposed modification is worth it (and likely more time for a proof). And with no prior knowledge, most proposed modifications would not be improvements.
AI ethics is not necessary to constrain such systems; a non-lenient pruning process (which would be required anyway for efficiency on ordinary problems) is enough.
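Here is a minimal sketch of that cost argument in Python. All the numbers (per-problem time, benchmark size, candidate count, speedups) are made-up placeholders chosen only to illustrate the bookkeeping, not anyone’s actual design:

    # Hypothetical illustration: verifying candidate self-modifications costs far more
    # than the modifications are likely to return, when ordinary problems already take
    # a long time to solve.
    import random

    HOURS_PER_PROBLEM = 10.0   # assumed time the system needs per ordinary problem
    BENCHMARK_SIZE = 20        # problems needed to trust a measured speedup
    NUM_CANDIDATES = 50        # random candidate modifications to evaluate

    def benchmark_cost(speedup):
        """Simulated wall-clock hours to rerun the whole benchmark under one candidate."""
        return sum(HOURS_PER_PROBLEM / speedup for _ in range(BENCHMARK_SIZE))

    random.seed(0)
    # Most random modifications are not improvements (speedup < 1); a few help slightly.
    candidates = [1.05 if random.random() < 0.02 else random.uniform(0.5, 1.0)
                  for _ in range(NUM_CANDIDATES)]

    verification_hours = sum(benchmark_cost(s) for s in candidates)
    print(f"~{verification_hours:,.0f} hours of benchmarking for a best speedup of "
          f"{max(candidates):.2f}x on {HOURS_PER_PROBLEM:.0f}-hour problems")

The point is just that the testing step scales with (candidates × benchmark size × ordinary solve time), which is exactly the pruning cost described above.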
You are talking about an AI that was designed to self-examine and optimize itself. Otherwise it will never ever be a full AGI. We are not smart enough to build one from scratch. The trick, if possible, is to get it to not modify the fundamental Friendliness goal during its self-modifications.
There are algorithms in narrow AI that do learning and modify algorithm specifics, or choose among algorithms or combinations of algorithms. There are algorithms that search for better algorithms. In some languages (the Lisp family) there is little or no difference between code and data, so code modifying code is a common working methodology for human Lisp programmers. A crossing from code/data space to hardware space would be sufficient for such an AI to redesign the hardware it runs on as well. Such goals can either be hardwired or arise under the general goal of improvement, given adequate knowledge of hardware or the ability to acquire it.
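As a toy illustration of the code-as-data point, here is a sketch in Python rather than Lisp (and, to be clear, it is nothing like genuine self-improvement, just a program treating one of its own functions as data, rewriting it, and running the result):

    # Toy "code modifying code": parse a function's own source into a syntax tree,
    # rewrite it, compile it, and run the rewritten version.
    # (Run as a script file so inspect.getsource can find the source.)
    import ast
    import inspect

    def score(x):
        return x * 2 + 1

    class SwapMulForAdd(ast.NodeTransformer):
        """Replace every multiplication with an addition (an arbitrary 'modification')."""
        def visit_BinOp(self, node):
            self.generic_visit(node)
            if isinstance(node.op, ast.Mult):
                node.op = ast.Add()
            return node

    tree = ast.parse(inspect.getsource(score))          # the function as data
    new_tree = ast.fix_missing_locations(SwapMulForAdd().visit(tree))
    namespace = {}
    exec(compile(new_tree, "<modified>", "exec"), namespace)

    print(score(10))                 # original behaviour: 10 * 2 + 1 = 21
    print(namespace["score"](10))    # rewritten behaviour: 10 + 2 + 1 = 13

In a Lisp the same thing is even more direct, since programs are literally lists that other programs can build and transform.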
We ourselves are general purpose machines that happen to be biological and seek to some degree to understand ourselves enough to self-modify to become better.
I am talking about AIs designed for solving specific bounded problems. In this case the goal of the AI—which is to solve the problem efficiently—is as much of a constraint as its technical capabilities. Even if the AI has fundamental-self-modification routines at its disposal, I can hardly envisage a scenario in which the AI decides that the use of these routines would constitute an efficient use of its time for solving its specific problem.
“So instead of modifying its own source code, the AI programs a new, more powerful AI from scratch, that has the same values as the old AI, and has no prohibition against modifying its source code.”
Isn’t that the same as self-modifying code?
Or perhaps it’s the contrary: pervasive narrow AI fosters an undue sense of security. People become comfortable via familiarity, whether it’s justified or not. This morning I was peering down a 50 foot cliff, half way up, suspended by nothing but a half inch wide rope. No fear, no hesitation, perfect familiarity. Luckily, due to knowledge of numerous deaths of past climbers I can maintain a conscious alertness to safety and stave off complacency. But in the case of AI, what overt catastrophes will similarly stave off complacency toward existential risk short of an existential catastrophe itself?
What a strange thing to say.
Our current conception of AGI is based on a biased comparison of hypothetical AGI capabilities with our relatively unenhanced capabilities. By the time AGI is viable, a typical professional with expert systems will be able to vastly outperform current professionals with our current tools.
What about the speed bottleneck from human decision making, compounded by human working memory bottleneck, if lots of relevant data is involved? Algorithmic trading already has automated systems doing stock trades since they can make decisions so much faster than a human expert.
Expert systems would be faster still. For AGI to be justified in this case, you would need a task that required both speed and creativity.
I imagine being very fast would be a great help in quite a few creative tasks. Off the top of my head, being able to develop new features in software in seconds instead of days would be a significant competitive advantage.
“AGI capability” is to rewrite the universe.
Yes, but it would have to take the resources from humans first.
You make some good points about economic and political realities. However, I’m deeply puzzled by some of your other remarks. For example, you make the claim that general AI wouldn’t provide any benefits above expert systems. This puzzles me because expert systems are by nature highly limited. Expert systems cannot construct new ideas, nor can they handle anything that’s even vaguely cross-disciplinary. No number of expert systems will be able to engage in the same degree of scientific productivity as a single bright scientist.
You also claim that no general AI is better than friendly AI. This is deeply puzzling. It makes sense only if one is fantastically paranoid about the loss of jobs. But new technologies are often economically disruptive. There are all sorts of jobs that were around a hundred years ago, or even fifty, that don’t exist now. And yes, people lost jobs. But overall, they are better for it. You would need to make a much stronger case if you are trying to establish that no general AI is somehow better than friendly general AI.
Why do you think expert systems cannot handle anything cross-disciplinary? I would even say that expert systems can generate new ideas, by more or less the same process that humans do. An expert system only needs an understanding of manufacturing, physics, and chemistry to design better computer chips, for instance. If you’re talking about revolutionary, paradigm-shifting ideas, we are probably already saturated with such ideas. The main bottleneck inhibiting paradigm shifts is not the ideas but the infrastructure and the economic need for the shift. A company that can produce a 10% better product can already take over the market; a 200% better product is overkill, and especially unnecessary if there are substantial costs in overhauling the production line.
The reason why NO general AI is better than friendly (general) AI is very simple. IF general AI is an existential threat, then no organization claiming to put humans first could justify being pro-AGI (friendly or not), since no possible benefit* can justify the risk of destroying humanity.
*save for mitigating an even larger risk of annihilation, of course
Expert systems generally need very narrow problem domains to function. I’m not sure how you would expect an expert system to have an understanding of three very broad topics. Moreover, I don’t know exactly how humans come up with new ideas (sometimes when people ask me, I tell them that I bang my head against the wall. That’s not quite true but it does reflect that I only understand at a very gross level how I construct new ideas. I’m bright but not very bright, and I can see that much smarter people have the same trouble). So how you are convinced that expert systems could construct new ideas is not at all clear to me.
To be sure, there has been some limited work with computer systems coming up with new, interesting ideas. There has been some limited success with computers in my own field; see for example Simon Colton’s work. There has also been similar work in geometry and group theory. But none of these systems were expert systems as that term is normally used. Moreover, none of the ideas they’ve come up with have been that impressive. The only exception I’m aware of is the proof of the Robbins conjecture. So even in narrow areas we’ve had very little success using specialized AIs. Are you using a more general definition of expert system than is standard?
Multiple problems with that claim. First, the existential threat may be low. There’s some tiny risk for example that the LHC will destroy the Earth in some very fun way. There’s also some risk that work with genetic engineering might give fanatics the skill to make a humanity destroying pathogen. And there’s a chance that nanotech might turn everything into purple with green stripes goo (this is much more likely than gray goo of course). There’s even some risk that proving the wrong theorem might summon Lovecraftian horrors. All events have some degree of risk. Moreover, general AI might actually help mitigate some serious threats, such as making it easier to track and deal with rogue asteroids or other catastrophic threats.
Also, even if one accepted the general outline of your argument, one would conclude that that’s a reason why organizations shouldn’t try to make friendly general AI. It isn’t a reason that actually having no general AI is better than having friendly AI.
“First, the existential threat [of AGI] may be low.”
Let me trace back the argument tree for a second. I originally asked for a defense of the claim that “SIAI is tackling the world’s most important task.” Michael Porter responded, “The real question is, do you even believe that unfriendly AI is a threat to the human race, and if so, is there anyone else tackling the problem in even a semi-competent way?” So NOW in this argument tree, we’re assuming that unfriendly AI IS an existential threat, enough that preventing it is the “world’s most important task.”
Now in this branch of the argument, I assumed (but did not state) the following: If unfriendly AI is an existential threat, friendly AI is an existential threat, as long as there is some chance of it being modified into unfriendly AI. Furthermore, I assert that it’s a naive notion that any organization could protect friendly AI from being subverted.
AIs, including ones with Friendly goals, are apt to work to protect their goal systems from modification, as this will prevent their efforts from being directed towards things other than their (current) aims. There might be a window while the AI is mid-FOOM where it’s vulnerable, but not a wide one.
How are you going to protect the source code before you run it?
A Friendly AI ought to protect itself from being subverted into an unfriendly AI.
Let me posit that FAI may be much less capable than unfriendly AI. The power of unfriendly AI is that it can increase its growth rate by taking resources by force. An FAI would be limited to what resources it could ethically obtain. Therefore, a low-grade FAI might be quite vulnerable to human antagonists, while its unrestricted version could be orders of magnitude more dangerous. In short, FAI could be low-reward, high-risk.
There are plenty of resources that an FAI could ethically obtain, and with a lead time of less than a day, it could grow enough to be vastly more powerful than an unfriendly seed AI.
Really, asking which AI wins going head to head is the wrong question. The goal is to get an FAI running before unfriendly AGI is implemented.
Wrong. FAI will take whatever unethical steps it must, as long as that is, on net, the best path it can see, taking into account both the (ethically harmful) instrumental actions and their expected outcome. There is no such general disadvantage that comes with an AI being Friendly. Not that I expect any need for such drastic measures (in an apparent way), especially considering the likely first-mover advantage it’ll have.
If a program can take an understanding of those subjects and design a better computer chip, I don’t think it’s just an “expert system” anymore. I would think it would take an AI to do that. That’s an AI complete problem.
Are you serious? I would think the exact opposite is true: we have an infrastructure starving for paradigm-shifting ideas. I’d love to hear some of these revolutionary ideas that we’re saturated with. I think we have some insights, but those insights need to be fleshed out and implemented, and figuring out how to do that is the paradigm shift that needs to occur.
Wait a minute. If I could press a button now with a 10% chance of destroying humanity and a 90% chance of solving the world’s problems, I’d do it. Everything we do has some risk. Even the LHC had an (extremely minuscule) risk of destroying the universe, but doing a cost-benefit analysis should reveal that some things are worth minor chances of destroying humanity.
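To spell out that cost-benefit claim (with U as a purely illustrative utility scale, not anything anyone has actually measured): pressing the button is worth it exactly when

    0.9\,U(\text{problems solved}) + 0.1\,U(\text{extinction}) \;>\; U(\text{status quo}),

so the disagreement is really over how the three outcomes are valued, not over the arithmetic.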
“If a program can take an understanding of those subjects and design a better computer chip, I don’t think it’s just an “expert system” anymore. I would think it would take an AI to do that. That’s an AI complete problem.”
What I had in mind was some sort of combinatorial approach to designing chips, i.e. take these materials and randomly generate a design, test it, and then start altering the search space based on the results. I didn’t mean “understanding” in the human sense of the word, sorry.
“I’d love to hear some of these revolutionary ideas that we’re saturated with. I think we have some insights, but these insights need to be fleshed out and implemented, and figuring out how to do that is the paradigm shift that needs to occur”
Example: many aspects of the legal and political systems could be reformed, and it’s not difficult to come up with ideas on how they could be reformed. The benefit is simply insufficient to justify spending much of the limited resources we have on solving those problems.
“Wait a minute. If I could press a button now with a 10% chance of destroying humanity and a 90% chance of solving the world’s problems, I’d do it. ”
So you think there’s a >10% chance that the world’s problems are going to destroy humanity in the near future?
Given the very large number of possibilities and the difficulty of making prototypes, this seems like an extremely inefficient process without more thought going into it.
Oh, okay, fair enough, though I’m still not sure I would call that an “expert system” (this time for the opposite reason that it seems too stupid).
Ah. I was thinking of designing an AI, probably because I was primed by your expert system comment. Well, in those cases, I think the issue is that our legal and political systems were purposely set up to be difficult to change: change requires overturning precedents, obtaining majority or 3⁄5 or 2⁄3 votes in various legislative bodies, passing constitutional amendments, and so forth. And I can guarantee you that for any of these reforms, there are powerful interests who would be harmed by the reforms, and many people who don’t want reform: this is more of a persuasion problem than an infrastructure problem. But yes, you’re right that there are plenty of revolutionary ideas about how to reform, say, the education system: they’re just not widely accepted enough to happen.
I’m confused by this sentence. I’m not sure if I think that, but what does it have to do with the hypothetical button that has a 10% chance of destroying humanity? My point was that it’s worth taking a small risk of destroying humanity if the benefits are great enough.
Bear in mind that the people who used steam engines to make money didn’t make it by selling the engines: rather, the engines were useful in producing other goods. I don’t think that the creators of a cheap substitute for human labor (GAI could be one such example) would be looking to sell it necessarily. They could simply want to develop such a tool in order to produce a wide array of goods at low cost.
I may think that I’m clever enough, for example, to keep it in a box and ask it for stock market predictions now and again. :)
As for the “no free lunch” business, while it’s true that any real-world GAI could not efficiently solve every induction problem, it wouldn’t need to in order to be quite fearsome. Indeed, being able to efficiently solve at least the same set of induction problems that humans solve (particularly if it’s in silicon and the hardware is relatively cheap) is sufficient to pose a big threat (and to be potentially quite useful economically).
Also, there is a non-zero possibility that a GAI already exists and that its creators decided the safest, most lucrative, and most beneficial thing to do was to set it to designing drugs, thereby avoiding giving the GAI too much information about the world. The creators could then have set up a biotech company that just so happens to produce a few good drugs now and again. It’s kind of like how automated trading came from computer scientists and not from the traders who were already employed. I do think it’s unlikely that somebody working in medical research will develop GAI, not least because of the threat to their own jobs. The creators of a GAI are probably going to be full-time professionals working on the project.
I’m surprised that nobody so far has pointed out a rather obvious counter to my argument that “AGI will be politically unjustifiable.” I don’t post flawed arguments on purpose, but I usually realize the counterarguments shortly after I post them. In any case, even if the popular sentiment in democracies is to block AGI, this doesn’t mean that other governments couldn’t support AGI. I wonder what SIAI plans to do about the possibility of a hostile government funding unfriendly AI for military purposes.
The latter part, that IF SIAI is exerting a positive influence, THEN doing that outweighs the alternative of not working on existential risks, seems to be a claim somewhat easy to defend.
The math in this Bostrom paper should do it: http://www.nickbostrom.com/astronomical/waste.html (even though the paper is not directly commenting on this particular question, the math rather straightforwardly applies to this question)
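For what it’s worth, here is the shape of that math, with N standing in for whatever astronomical number of worthwhile future lives you accept and G for the alternative present good measured in the same units (both placeholders, not figures taken from the paper):

\[ \Delta p \cdot N > G \quad \Longleftrightarrow \quad \Delta p > \frac{G}{N} \]

With the 10^23 figure that comes up later in this thread for N, and G generously valued at 10^9 lives, any reduction in existential risk larger than about 10^-14 already dominates.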
Ouch. This paper reads to me like a reductio ad absurdum of utilitarianism. Some simple math inevitably implies that I’m losing an unimaginable amount of “utility” every second without realizing it? Then please remind me why I should care about this “utility”?
Imagine that you have to decide once and for all eternity what to do with the world. You won’t be able to back off, because that would just mean that the world will be rewritten randomly. How should you do that?
This is essentially the situation we find ourselves in, with Friendly AI/existential risk pressure. Formal preference is the answer you give to that question about what to do with the world, not something that you “have” or “care about”. Forget intuitions and emotions, or considerations of comfort, and just answer the question. Formal preference is distinct from an exact state of the world only because it’s uncertain what can actually be done and what can’t. So formal preference specifies what should be done for every level of capability to determine things. Of course, formal preference can’t be given explicitly. To the extent you’re able to express an answer to this question, your formal preference is defined by your wishes; any uncertainty gets taken over by randomness, an opportunity to make the world better lost forever.
For any sane notion of an answer to that question, you’ll find that whatever actually happens now is vastly suboptimal.
If it’s your chosen avenue of research, I guess I’m okay with that, but IMO you’re making the problem way more difficult for yourself. Such “formal preferences” will be much harder to extract from actual humans than utility functions in their original economic sense, because unlike utility, “formal preference” as you define it doesn’t even influence our everyday actions very much.
Way more difficult than what? There is no other way to pose this problem, any revealed preference is not what Friendly AI is about. I agree that it’s a way harder problem than automatic extraction of utilities in the economic sense, and that formal preference barely controls what people actually do.
What would be wrong with an AI based on our revealed preferences? It sounds like an easy question, but somehow I’m having a hard time coming up with an answer.
Because my revealed preferences suck. The difference between even what I want in a sort of ordinary and non-transhumanist way and what I have is enormous. I am 150 pounds heavier than I want to be. My revealed preference is to eat regardless of health/size consequences, but I don’t want all of the people in the future to be fat. My revealed preference is also to kill people in pooristan so that I can have cheap plastic widgets or food or whatever. I don’t want an extrapolation of my akratic actual actions controlling the future of the universe. I suspect the same goes for you.
Hmm. Let’s look more closely at the weight example, because the others are similar. You also reveal some degree of preference to be thin rather than fat, don’t you? Then an AI with unlimited power could satisfy both your desire to eat and your desire to be thin. And if the AI has limited power, do you really want it to starve you rather than go with your revealed preference?
Revealed preference means what your actual actions are. It doesn’t have anything at all to do with what I verbally say my goals are. I can say that I would prefer to be thin all I want, but that isn’t my revealed preference. My revealed preference is to be fat, because, you know, that’s how I’m acting. You seem to be under some misapprehension about what you are actually saying an AI should act on. If your definition of revealed preference contains my desire not to be fat, you should shift to what I mean when I talk about preference, because yours solves none of the problems you think it does.
Is your revealed preference to be fat, or is it to eat and exercise (or not exercise) in ways which incidentally result in your being fat?
I’m assuming that you revealed your preference to be thin in your other actions, at some other moments of your life. Pretty hard to believe that’s not the case.
At this point, I think I can provide a definitive answer to your earlier question, and it is … wait for it … “It depends on what you mean by revealed preference.” (Raise your hand if you saw that one coming! I’ll be here all week, folks!)
Specifically: if the AI is to do the “right thing,” then it has to get its information about “rightness” from somewhere, and given that moral realism is false (or however you want to talk about it), that information is going to have to come from humans, whether by scanning our brains directly or just superintelligently analyzing our behavior. Whether you call this revealed preference or Friendliness doesn’t matter; the technical challenge remains the same.
One argument against using the term revealed preference in this context is that the way the term gets used in economics fails to capture some of the key subtleties of the superintelligence problem. We want the AI to preserve all the things we care about, not just the most conspicuous things. We want it to consider not just that Lucas ate this-and-such, but also that he regretted it afterwards, where it should be stressed that regret is not any less real of a phenomenon than eating is. But because economists often use their models to study big public things like the trade of money for goods and services, in the popular imagination, economic concepts are associated with those kinds of big public things, and not small private things like feeling regretful—even though you could make a case that the underlying decision-theoretic principles are actually general enough to cover everything.
If the math only says to maximize u(x) subject to x dot p equals y, there’s no reason things like ethical concerns or the wish to be a better person can’t be part of the x_i or p_j, but because most people think economics is about money, they’re less likely to realize this when you say revealed preference. They’ll object, “Oh, but what about the time I did this-and-such, but I wish I were the sort of person that did such-and-that?” You could say, “Well, you revealed your preference to do such-and-that in your other actions, at some other moments of your life,” or you could just choose a different word. Again, I’m not sure it matters.
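For concreteness, the textbook problem being gestured at is just

\[ \max_{x} \; u(x) \quad \text{subject to} \quad p \cdot x = y, \]

and nothing in that formalism restricts the components x_i to marketed goods: “kept my resolution today” or “felt no regret” can be coordinates of x just as well as “loaves of bread”, with the corresponding p_j read as shadow prices rather than market prices.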
What an AI is based on is what determines the way the world will actually be, so by building an AI with a given preference, you are inevitably answering my question about what to do with the world. It’s wrong to use revealed preference for AI to the same extent that revealed preference gives the wrong answer to my question. You seem to agree that the correct answer to my question has little to do with revealed preference. This seems to be the same as seeing revealed preference as the wrong thing to imprint an AI with.
It’s not you that’s “losing utility”, it is any agent that has linearly aggregative utility in human lives lived. If you’re not an altruist in this sense, then you don’t care.
No one has ever been an altruist in this crazy sense. No one’s actual wants and desires have ever been adequately represented by this 10^23 stuff. Utility is a model of what people want, not a prescription of what you “should” want (what does “should want” mean anyway?), and here we clearly see the model not modeling what it’s supposed to.
I agree with you to the extent that no one I am aware of is actually expending the effort that disutilities on the order of 10^23 should inspire. But even before the concept of cosmic waste was developed, no one was actually working as hard as, say, starvation in Africa deserved. Or ending aging. Or the threat of nuclear Armageddon. But the fact that humans, who are all affected by akrasia, aren’t actually doing what they want isn’t really strong evidence that it isn’t what they, on sufficient reflection, want. Utility is not a model of what non-rational agents (i.e. humans) are doing; it is a model of how idealized agents would want to act. I don’t want people to die, so I should work to reduce existential risk as much as possible, but because I am not a perfect agent, I can’t actually follow the path that really maximizes my (non-existent abstraction of) utility.
Can you expand on this? What do you mean by “actual” wants? If someone claims to be motivated by “10^23 stuff”, and acts in accordance with this claim, then what is your account of their “actual wants”?
I haven’t seen anyone who claims to be motivated by utilities of such magnitude except Eliezer. He’s currently busy writing his Harry Potter fanfic and shows no signs of mental distress that the 10^23-strong anticipation should’ve given him.
From the Author’s Note:
From Kaj Sotala:
Eliezer is not “busy writing his Harry Potter fanfic.” He is working on his book on rationality.
The Harry Potter fanfic is a book on rationality. And a damn good one.
To clarify, Eliezer Yudkowsky is working both on a book and on the Harry Potter fanfiction in question. Both pertain to rationality.
Have you read Eliezer’s Sequences?