He will probably start arguing for the correct position once it is supported enough as not to be destroyed by incorrect positions.
The main thing I can recall from the 2008 debate mentioned was Hanson’s position being essentially destroyed by Hanson via support that made no sense, making Eliezer largely redundant.
Hanson’s position being essentially destroyed by Hanson via support that made no sense...
As far as I can tell Hanson does not disagree with Yudkowsky except for the probability of risks from AI. Yudkowsky says that existential risks from AI are not under 5%. Has Yudkowsky been able to support this assertion sufficiently? Hanson only needs to show that it is unreasonable to assume that the probability is larger than 5% and my personal perception is that he was able to do so.
I have already posted various arguments for why I believe that the case for risks from AI, and especially recursive self-improvement (explosive), is not as strongly supported as some people seem to think. I haven’t come across a good refutation, or a strong argument to the contrary.
You are of course free to suspect and declare that I was already given sufficient evidence and that my motives are not allowing me to admit that I am wrong. I won’t perceive anything like that as a personal attack and ask everyone not to downvote you for any such statements. Think I am a troll or an idiot, let me know, I want to know :-)
The debate seems to be about the probability of the specific “AI in a basement goes foom” scenario. There are surely other existential risk scenarios which seem more Hansonian, e.g. firms gradually replacing all of their human capital with computers and robots until there are no humans left?
Hanson’s arguments here don’t apply to nation-states that start off with a large portion of world research resources and stronger ability to keep secrets, as we saw with nuclear weapons. He has a quite separate argument against that, which is that governments are too stupid to notice early brain emulation or AI technology and recognize that it is about to turn the world upside down.
I don’t buy it at that level of confidence. Robin says the Manhattan Project was an anomaly in wartime, and that past efforts to restrict the spread of technologies like encryption and supercomputers didn’t work for long (I’d say the cost-benefit for secrecy here was much, much worse than for human-level AI/WBE). My reply is that a delay of 4 years, like that between the US and Soviet nuclear tests, would be a long, long time for WBE or human-level AI to drive a local intelligence explosion. Software is easier to steal, but even so.
Your reply is focused on keeping secrets. I meant my comment to apply to the second claim, the one about governments being “too stupid”. That claim might be right, but it is not obvious. Government departments focused on this sort of thing (of which there are several) will understand, and no doubt already understand. The issue is more whether the communication lines are free, whether the top military brass take their own boffins seriously, and whether they go on to get approval from head office.
As for secrecy, the NSA has a long history of extreme secrecy. The main reason most people don’t know about their secret tech projects is that their secrecy is so good. If they develop a superintelligence, I figure it will be a secret one that will probably remain chained up in their basement. They are the main reason my graph has some probability mass already.
A while back I posted this minimalist account of Eliezer’s case for the importance of FAI to human survival. (Claim B technically seems too specific if you want to talk about the existential risk as a whole, but I think it reflects his view.)
So far I can’t tell if you agree that each claim has easily more than .5 probability given the evidence, nor if you share my view that Claim A as separate from the rest has P close to 1. In particular, you said here that you believe:
if you waste too much time with spatiotemporal bounded versions then someone who is ignorant of friendliness will launch one that isn’t constrained that way.
By the same principle, the speed or slowness of FOOM doesn’t matter in the long run unless some force with the power to stop it does so, and unless this happens every single time someone creates an unFriendly AI with the power to self-modify. I have almost no confidence that humanity in general will learn from past mistakes (and precious little confidence in the subset that could write the second or third AGI). So I think we need to look at the cumulative chance for Claim B, Claim C, and perhaps even D.
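To make the cumulative point concrete, here is a minimal sketch. It assumes, purely for illustration, that each unFriendly AI is stopped independently and with the same probability; the function name and the 0.99 figure are hypothetical and are not estimates from this thread.

```python
def p_at_least_one_unstopped(p_stopped: float, attempts: int) -> float:
    """Chance that at least one of `attempts` unFriendly AIs is NOT stopped.

    Assumption (not from the thread): every attempt is stopped independently
    with the same probability `p_stopped`.
    """
    if not 0.0 <= p_stopped <= 1.0:
        raise ValueError("p_stopped must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # All attempts are stopped with probability p_stopped ** attempts,
    # so at least one gets through with the complement of that.
    return 1.0 - p_stopped ** attempts


if __name__ == "__main__":
    # Hypothetical per-attempt stop probability of 0.99.
    for n in (1, 10, 100):
        print(n, round(p_at_least_one_unstopped(0.99, n), 3))
```

Under these assumptions even a very reliable stopping force loses ground as the number of attempts grows, which is why the claim depends on the stop happening every single time.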
Even so, it seems possible that the actual risk stays within 5%. Maybe you think some form of FAI, such as Friendly uploads, will prove easy once we get the capacity for some form of AGI. Maybe you think we seem likely to kill ourselves with X before then. Maybe you think some other force(s) will stop each and every AGI. If so, I’d like to hear your reasoning.
And if not, if you want to argue against my claims in some other way, please do so without identifying them with a more specific storyline.
ETA: I apparently forgot how to use links. I believe this means I should go eat or sleep. Take that as you will.
So far I can’t tell if you agree that each claim has easily more than .5 probability given the evidence, nor if you share my view that Claim A as separate from the rest has P close to 1.
The whole dispute is about your claim A. It gives a lot of credence to Y’s idea of where things are headed (someone is going to write a single AI that takes over the world) and none to H’s (someone is going to upload some humans and make trillions of copies). Those are two very different possibilities with different consequences, and there’s no reason to believe they come close to an exhaustive list of plausible scenarios.
Actually, I wrote Claim A in such a way that it applies to uploads. Near as I can tell H disagrees with B and/or D, or else he believes that FAI will ‘emerge’ naturally (e.g. from interactions between uploads).
Well, if we understand A to include ems, then your B, C and D are misleadingly worded. They all speak of a single AI. If you reworded them to allow for ems, then I guess I would agree that the dispute is located there. But I would be interested to see such a rewording—I think it would be favorable to Hanson’s point of view.
It’s also not clear to me that “single powerful AI or trillions of uploads” together add up to p > .5! But that might be off-topic.
Favorable? I don’t know why you’d think that. Seems to me the charitable interpretation of Hanson’s view has him thinking of ems as naturally Friendly, or near-Friendly. (My analysis didn’t mention the chance of us getting FAI without working for it.)
If we get two unFriendly AIs that individually have the power to kill humanity, and if acting quickly means they don’t have to negotiate with anyone else from this planet, they’ll divide Earth between them. If we somehow get trillions of uFAIs with practically different goals, then of course the expected value of killing humanity goes way down. But it still sounds greater than the expected value of cooperating with us, by Hanson’s analysis. And if we get one FAI out of a trillion AGIs, I think that leads to either war or a compromise like the one the Super-Happies offered the Babyeaters. We might get a one-trillionth slice of the available matter, with no more thought given to (say) the aesthetics of the Moon than we give to any random person who’d like to see his name there in green neon every night. (Still a better deal than we’d offer any one upload according to Hanson. But maybe I’ve misunderstood him?)
I also don’t understand how you get that much memory and processing power without some designer that seems awfully close to an AI-programming AI. But as a layman I may be thinking of that in the wrong way.
Oh, and I lean towards P(genocide) somewhat under .4 without FAI theory. Right now I’m just arguing that it exceeds .05 per XiXiDu’s comment. You may have misread his “5%” there.
Favorable? I don’t know why you’d think that. Seems to me the charitable interpretation of Hanson’s view has him thinking of ems as naturally Friendly, or near-Friendly
You would have to tell me what friendly and unfriendly means in this context. Hanson expects ems to be very numerous and very poor. I doubt he expects any one of them to have the resources available to what’s usually called an fai. Is a human being running at human speeds F or UF?
If we somehow get trillions of uFAIs with practically different goals, then of course the expected value of killing humanity goes way down. But it still sounds greater than the expected value of cooperating with us, by Hanson’s analysis.
I don’t think the notion of “cooperating with us” is coherent. Just as the trillions of ems might have practically different goals, so might the billions of live humans.
I also don’t understand how you get that much memory and processing power without some designer that seems awfully close to an AI-programming AI.
Possibly, being poor, they would not have that much memory and processing power.
Taking the last part first for context: this layman thinks that just simulating a conscious brain (experiencing something other than pure terror or slow insanity) would take a lot of resources using the copy-an-airplane-with-bullet-holes approach where you don’t know what the parts actually do, at least not well enough to make a self-reflective programming AI from scratch.
As to the rest, I’m assuming my previous claims hold for the case of a single AGI because you seemed to argue that simply introducing a lot more AGIs changes the argument. (“Cooperating with us” therefore means not killing us all.) I started out by granting that the nature of the AIs could make a big difference. The number seems almost irrelevant. It seems like you’re arguing for the possibility that no single em would have enough resources to produce super-intelligence (making assumptions about what that requires), since they might find themselves sharing a medium with trillions of competing ems before they get that far. But this appears to mean that giving more resources to any one of them (or any group with consistent goals) could easily produce super-intelligence. Someone would eventually do this. Indeed, Hanson seems to argue that a workforce of ems would help produce better technology and thus better ems.
I do have to address the possibility that the normal ems themselves could stop a self-modifying AI because they would think faster than ordinary humans. That situation would certainly decrease the risk of killing humanity. But again, for that to make sense you have to assume that effective self-modification requires vast resources (or you just get trillions of self-modifiers by assumption). You may also need to assume that a super-intelligence needs even more resources to work out a plan for killing us—otherwise the rest of the ems would seemingly have no way to discover the plan before it went into motion, except by chance. (A superior intelligence would try to include their later actions in the plan.) Note that even these assumptions do not yield near-certainty of survival given uFAI, not with the observed stupidity of humanity. Seems like you’d at least need the additional assumption that no biological human who the uFAI can reach and fool has the power to trigger our demise.
And then of course we have the relative difficulty of emulation and new designs for reflective AI. It took no time at all to find someone arguing for the impossibility of the former, on the grounds that normal emulation requires knowing what the original does well enough to copy one part and not another. If we get that knowledge it increases the likelihood of new AGI—indeed, it almost seems to require making ‘narrow AIs’ along the way, and by assumption this happens before we know what each one can do.
My main reason for doubting this part, however, lies in the fact that it suggests we can avoid otherwise difficult work, and the underlying belief seems to have grown in popularity along with our grasp of said difficulty.
And if not, if you want to argue against my claims in some other way, please do so without identifying them with a more specific storyline.
That’s one of the main problems I have with the whole existential risks prediction business. There is a specific storyline; it is contained in the vagueness of your claims. If you tried to pin down a concept like ‘recursive self-improvement’ that supports the notion of an existential risk, you would end up with an argument that is strongly conjunctive. Most of the arguments in favor of risks from AI derive their appeal from vagueness; that doesn’t mean that they are disjunctive.
Yes, I said that I believe that even sub-human level AI poses an existential risk. At the same time I am highly skeptical of FOOM. So why don’t I agree with Eliezer outright anyway? Because the risks from AI that I perceive to be a possibility are not something you can solve by inventing provable “friendliness”. How are you going to make a sophisticated monitoring system friendly? Why would people want to make it friendly? How are you going to make a virus with sub-human level general intelligence friendly? Why would one do that? Risks from AI are a broad category that needs meta-solutions that involve preemptive political and security measures. You need to make sure that the first intelligent surveillance systems are employed transparently and democratically so that everyone can monitor the world for the various risks ahead. We need a global immune system that takes care that no one anywhere gets ahead of everyone else.
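The conjunctive/disjunctive distinction can be put in numbers with a rough sketch. It assumes the steps of an argument are independent, which real arguments rarely are, and the probabilities below are made-up placeholders rather than anyone's actual estimates.

```python
from math import prod


def conjunctive(step_probs):
    """Probability that ALL steps hold (assumes independent steps)."""
    return prod(step_probs)


def disjunctive(path_probs):
    """Probability that AT LEAST ONE path holds (assumes independent paths)."""
    return 1.0 - prod(1.0 - p for p in path_probs)


if __name__ == "__main__":
    # Five steps that each look fairly likely still multiply down.
    print(round(conjunctive([0.8] * 5), 5))
    # Five separate routes that each look unlikely still add up.
    print(round(disjunctive([0.2] * 5), 5))
```

The same set of claims can therefore look strong or weak depending on whether the conclusion needs all of them or only one, which is the crux of the disagreement here.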
Have you taken your own survey and published the results somewhere? Or is it only for AI researchers? It seems like there are a great many hidden assumptions on all sides which make these discussions go off the tracks very quickly. Some kind of basic survey with standard probability estimates might easily show where views differ.
So why don’t I agree with Eliezer outright anyway? Because the risks from AI that I perceive to be a possibility are not something you can solve by inventing provable “friendliness”.
I agree that friendliness is a long shot. If you know of a better solution, please let me know.
How are you going to make a sophisticated monitoring system friendly?
By developing a theory of friendliness and implementing it in software.
Why would people want to make it friendly?
Because unfriendly things are bad.
How are you going to make a virus with sub-human level general intelligence friendly?
By developing a theory of friendliness and implementing it in software.
Why would one do that?
Because unfriendly things are bad.
Risks from AI are a broad category that needs meta-solutions that involve preemptive political and security measures. You need to make sure that the first intelligent surveillance systems are employed transparently and democratically so that everyone can monitor the world for the various risks ahead. We need a global immune system that takes care that no one anywhere gets ahead of everyone else.
Sounds like a job for CEV and a friendly AI to run it on.
Have you taken your own survey and published the results somewhere?
Yes I have done so. But I don’t trust my ability to make correct probability estimates, don’t trust the overall arguments and methods and don’t know how to integrate that uncertainty into my estimates. It is all too vague.
There sure are a lot of convincing arguments in favor of risks from AI. But do arguments suffice? Nobody is an expert when it comes to intelligence. Even worse, I don’t think anybody knows much about artificial general intelligence.
My problem is that I fear that some convincing blog posts are simply not enough. Just imagine all there was to climate change was someone with a blog who never studied the climate but instead wrote some essays about how it might be physically possible for humans to cause global warming. As if that were not enough, the same person then goes on to make further inferences based on the implications of those speculations. Am I going to tell everyone to stop emitting CO2 because of that? Hardly! Or imagine that all there was to the possibility of asteroid strikes was someone who argued that there might be big chunks of rock out there which might fall down on our heads and kill us all, inductively based on the fact that the Earth and the moon are also big rocks. Would I be willing to launch a billion dollar asteroid deflection program solely based on such speculations? I don’t think so. Luckily, in both cases, we got a lot more than some convincing arguments in support of those risks.
Another example: If there were no studies about the safety of high energy physics experiments then I might assign a 20% chance of a powerful particle accelerator destroying the universe based on some convincing arguments put forth on a blog by someone who never studied high energy physics. We know that such an estimate would be wrong by many orders of magnitude. Yet the reason for being wrong would largely be a result of my inability to make correct probability estimates, the result of vagueness or a failure of the methods I employed. The reason for being wrong by many orders of magnitude would have nothing to do with the arguments in favor of the risks, as they might very well be sound given my epistemic state and the prevalent uncertainty.
In summary: I believe that mere arguments in favor of one risk do not suffice to neglect other risks that are supported by other kinds of evidence. I believe that logical implications of sound arguments should not reach out indefinitely and thereby outweigh other risks whose implications are fortified by empirical evidence. Sound arguments, predictions, speculations and their logical implications are enough to demand further attention and research, but not much more.
I agree that friendliness is a long shot. If you know of a better solution, please let me know.
If there was a risk that might kill us with a probability of .7 and another risk with .1 while our chance to solve the first one was .0001 and the second one .1, which one should we focus on?
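One naive way to read the question, sketched below, is to multiply each risk's probability by the chance of solving it. This assumes that solving a risk removes it entirely, that the two risks are independent, and that effort is all-or-nothing; none of that is given in the question, and the function name is only illustrative.

```python
def expected_risk_reduction(p_risk: float, p_solve: float) -> float:
    """Expected drop in the chance of catastrophe from working on one risk.

    Assumptions (not from the thread): a solved risk drops to zero, and
    working on it does not affect the other risk.
    """
    return p_risk * p_solve


# The figures from the question: (probability of the risk, chance of solving it).
risks = {
    "first": (0.7, 0.0001),
    "second": (0.1, 0.1),
}

if __name__ == "__main__":
    for name, (p_risk, p_solve) in risks.items():
        print(name, expected_risk_reduction(p_risk, p_solve))
```

On this reading the second risk is the better target by roughly two orders of magnitude, even though the first is far more likely to kill us. The comparison can change once marginal returns on effort and the timing of the two risks are taken into account.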
Why do I feel like there’s massively more evidence than “a few blog posts”? I must be counting information I’ve gained from other studies, like those on human history, and lumping it all under “what intelligent agents can accomplish”. I’m likely counting fictional evidence, as well; I feel sort of like an early 20th century sci-fi buff must have felt about rockets to the moon. Another large part of being convinced falls under a lack of counterarguments—rather, there are plenty out there, just none that seem to have put thought into the matter.
At any rate, I’m not asking for the entire world to throw down their asteroid detection schemes or their climate mitigation strategies; that’s not politically feasible, regardless of risk probabilities. I’m just asking them to increase the size of the pie by a few million, maybe as little as one billion total, to add research about AI, and to spend more money on the whole gamut of existential risk reduction as a cohesive topic of great importance.
What would you tell the first climate scientist to examine global warming, or the first to predict asteroid strikes, other than “do more research, and get others to do research as well”?
What would you tell the first climate scientist to examine global warming, or the first to predict asteroid strikes, other than “do more research, and get others to do research as well”?
I have no problem with a billion dollars spent on friendly AI research. But that doesn’t mean that I agree that the SIAI needs a billion dollars right now or that I agree that the current evidence is enough to tell people to stop researching cancer therapies or creating educational videos about basic algebra. I don’t think we know enough about risks from AI to justify such advice. I also don’t think that we should all become expected utility maximizers because we don’t know enough about economics, game theory, and decision theory and especially about human nature and the nature of discovery.
Why do I feel like there’s massively more evidence than “a few blog posts”?
Maybe because there is massively more evidence and I don’t know about it, don’t understand it, haven’t taken it into account or because I am simply biased. I am not saying that you are wrong and I am right.
...those on human history, and lumping it all under “what intelligent agents can accomplish”.
Shortly after human flight was invented we reached the moon. Yet human flight is not as sophisticated as bird or insect flight, it is much more inefficient, and we never reached other stars. What I get out of this is that shortly after we invent artificial general intelligence we might reach human-level intelligence and in some areas superhuman intelligence. But that doesn’t mean that it will be particularly fast or efficient, or that it will be able to take over the world shortly after. Artificial general intelligence is already an inference made from what we currently believe to be true; going a step further and drawing further inferences from previous speculations, e.g. explosive recursive self-improvement, is in my opinion a very shaky business. We have no idea about the nature of discovery, or whether intelligence (whatever that is) is even instrumental or quickly hits diminishing returns.
In principle we could build antimatter weapons capable of destroying worlds, but in practise it is much harder to accomplish. The same seems to be the case for intelligence. It is not intelligence in and of itself that allows humans to accomplish great feats. Someone like Einstein was lucky to be born into the right circumstances, the time was ripe for great discoveries.
Another large part of being convinced falls under a lack of counterarguments—rather, there are plenty out there, just none that seem to have put thought into the matter.
Prediction: The world is going to end.
Got any counterarguments I couldn’t easily dismiss?
Most of the superficially disjunctive lines of reasoning about risks from AI derive their appeal from their inherent vagueness. It’s not like you don’t need any assumptions to be true to get “artificial general intelligence that can undergo explosive recursive self-improvement to turn all matter in the universe into paperclips”. That’s a pretty complex prediction actually.
There are various different scenarios regarding the possibility and consequences of artificial general intelligence. I just don’t see why the one put forth by the SIAI is more likely to be true than others. Why for example would intelligence be a single principle that, once discovered, allows us to grow superhuman intelligence overnight? Why are we going to invent artificial general intelligence quickly, rather than having to painstakingly optimize our expert systems over many centuries? Why would intelligence be effectively applicable to intelligence itself, rather than demanding the discovery of unknown unknowns due to sheer luck or the pursuit of treatments for rare diseases in cute kittens? Why would general intelligence be at all efficient compared to expert systems? Maybe general intelligence demands a tradeoff between plasticity and goal-stability. I can think of dozens of possibilities within minutes, none of them leading to existential risk scenarios.
I have no problem with a billion dollars spent on friendly AI research. But that doesn’t mean that I agree that the SIAI needs a billion dollars right now or that I agree that the current evidence is enough to tell people to stop researching cancer therapies or creating educational videos about basic algebra. I don’t think we know enough about risks from AI to justify such advice. I also don’t think that we should all become expected utility maximizers because we don’t know enough about economics, game theory, and decision theory and especially about human nature and the nature of discovery.
This is the part I’d like to focus on. Restating that position from my understanding, you are unconvinced that SIAI is important to fund, and you will not pay them to convince you, and it would be perfectly fine for other people to fund them, and you will be following the area to see if they provide convincing things in the future. Is that a fair characterization?
...you are unconvinced that SIAI is important to fund, and you will not pay them to convince you, and it would be perfectly fine for other people to fund them, and you will be following the area to see if they provide convincing things in the future. Is that a fair characterization?
Almost. I think it is important that the SIAI continues to receive at least as much as it did last year. If the SIAI’s sustainability were at stake I would contribute money; I just don’t know how much. I would probably devote some time to thinking about the whole issue, more thoroughly than I have until now. That also hints at a general problem: I think many people lack the initial incentive that is necessary to take the whole topic seriously in the first place, seriously enough to even invest the required time and resources to analyze the available data sufficiently.
I recently hinted at some problems that need to be addressed in order to convince me that the SIAI needs more money. I am currently waiting for the “exciting developments”, that have been mentioned in the subsequent comments thread, to take place.
Another problem is the secretive approach the SIAI seems to subscribe to. I am not convinced that a secretive approach is the right thing to do. I also don’t have enough confidence to just take their word for it if they say that they are making progress. They have to figure out how to convince people that actual progress is being made, or at least attempted, without revealing too much detail. They also have to explain whether they expect Eliezer Yudkowsky to be able to solve friendly AI on his own, or otherwise how they are going to guarantee the “friendliness” of future employees.
I think this thread started by timtyler is more representative of the opinion of most people (if they knew about the SIAI) than those members of lesswrong who are already sold. People here seem overly confident in what they are told without asking for further evidence. Not that I care about the AI box experiment, even prison guards can be persuaded by humans to let them out of the jail. But as timtyler said, the secretive approach employed by the SIAI, “don’t ask don’t tell”, isn’t going to convince many people any time soon. I doubt actual researchers would just trust the SIAI if they claimed they proved something without providing any evidence supporting the claim.
Shortly after human flight was invented we reached the moon. Yet human flight is not as sophisticated as bird or insect flight, it is much more inefficient, and we never reached other stars.
How do you mean? Human planes are faster and can transport freight better. They can even self-pilot with modern AI software. The biggest weaknesses would seem to be a lack of self-reproduction and self-repair, but those aren’t really part of flight.
How do you mean? Human planes are faster and can transport freight better.
Energy efficiency and maneuverability. I suppose a dragonfly would have been a better example. We never really went straight from no artificial flight towards generally superbird/insect flight. All we got are expert flight systems, no general flight systems. Even if we were handed the design for a perfect artificial dragonfly, minus the design for the flight of a dragonfly, we wouldn’t be able to build a dragonfly that can take over the world of dragonflies, all else equal, by means of superior flight characteristics.
Where are your figures for energy efficiency? (Recalling that the comparison should be for the same speed, or energy per kilogram transported for a kilometer given the optimal speed tradeoff).
A Harpy Eagle can lift more than three-quarters of its body weight while the Boeing 747 Large Cargo Freighter has a maximum take-off weight of almost double its operating empty weight. I suspect that insects can do better. But my whole point is that we never reached artificial flight that is strongly above the level of natural flight. An eagle can after all catch its cargo under various circumstances like the slope of a mountain or from under the surface of water, thanks to its superior maneuverability.
If there was a risk that might kill us with a probability of .7 and another risk with .1 while our chance to solve the first one was .0001 and the second one .1, which one should we focus on?
To solve this problem we need to know more. As it stands, the marginal effect of investment in the problems on the probability of the problems being solved is unknown—as is the temporal relationship of the problems. Do they arise at the same time? Is there going to be time to concentrate on the second problem after solving the first one? - etc.
I don’t trust my ability to make correct probability estimates, don’t trust the overall arguments and methods and don’t know how to integrate that uncertainty into my estimates. It is all too vague.
Essentially, uncertainty → wider confidence intervals and less certainty (i.e. fewer extreme probability estimates).
Given we survive long enough, we’ll find a way to write a self-modifying program that has, or can develop, human-level intelligence.
How can I arrive at the belief that it is possible for an algorithm to improve itself in a way to achieve something sufficiently similar to human-level intelligence? That it is in principle possible is not a question here. But is it possible given limited resources? And if it is possible given limited resources, is it efficient enough to pose an existential risk?
The capacity for self-modification follows from ‘artificial human intelligence,’ but since we’ve just seen links to writers ignoring that fact I thought I’d state it explicitly.
Humans can learn, that is far from what is necessary to reach a level above your own, on your own. Also, how do you know that any given level of intelligence is capable of handling its own complexity effectively? Many humans are not capable of handling the complexity of the brain of a worm.
This necessarily gives the AI the potential for greater-than-human intelligence due to our known flaws.
That humans have a hard time changing their flaws might be an actual feature, a trade-off between plasticity, efficiency and the necessity of goal-stability.
Given A, the intelligence would improve itself to the point where we could no longer predict its actions in any detail.
I don’t think that is a reasonable assumption, see my post here. The short version: I don’t think that intelligence can be applied to itself efficiently.
...the AI could escape from any box we put it in. (IIRC this excludes certain forms of encryption, but I see no remotely credible scenario in which we sufficiently encrypt every self-modifying AI forever.)
Well, even humans can persuade their guards to let them out. I agree.
...the AI could wipe out humanity if it ‘wanted’ to do so.
I think it is unlikely that most AI designs will not hold. I agree with the argument that any AGI that isn’t made to care about humans won’t care about humans. But I also think that the same argument applies for spatio-temporal scope boundaries and resource limits. Even if the AGI is not told to hold, e.g. compute as many digits of Pi as possible, I consider it a far-fetched assumption that any AGI intrinsically cares to take over the universe as fast as possible to compute as many digits of Pi as possible. Sure, if all of that is presupposed then it will happen, but I don’t see that most AGI designs are like that. Most that have the potential for superhuman intelligence, but that are given simple goals, will in my opinion just bob up and down as slowly as possible. This is an antiprediction, not a claim to the contrary. What makes you sure that it will be different?
Humans can learn, that is far from what is necessary to reach a level above your own, on your own.
Yes, you also need the ability to self-modify and the ability to take 20 or fail and keep going. But I just argued that the phrase “on your own” obscures the issue, because if one AGI has a chance to rewrite itself (and does not take over the world) then I see no realistic way to stop another from trying at some point.
Also, how do you know that any given level of intelligence is capable of handling its own complexity effectively?
I don’t think I need to talk about “any given level”. If humans maintain a civilization long enough (and I don’t necessarily accept Eliezer’s rough timetable here) we’ll understand our own level well enough to produce human-strength AGI directly or indirectly. By definition, the resulting AI will have at least a chance of understanding the process that produced it, given time. (When I try to think of an exception I find myself thinking of uploads, and perhaps byzantine programs that evolved inside computers. These might in theory fail to understand all but the human-designed parts of the process. But the second example seems unlikely on reflection, as it suggests vast amounts of wasted computation. Likewise, though I don’t know how much importance to attach to this, it seems to this layman as if biologists laugh at uploads and consider them a much harder problem than an AI with the power to program. Yet you’d need detailed knowledge of the brain’s biology to make an upload.) And of course it can think faster than we do in many areas (or if it can’t due to artificial restrictions, the next one can).
I don’t think that intelligence can be applied to itself efficiently.
You’ve established inefficiency as a logical possibility (in my judgement) but don’t seem to have given much argument for it. I count two sentences on your P2 that directly address the issue. And you have yet to engage with the cumulative probability argument. Note that a human-level AGI which can see problems or risks of self-modification may also see risk in avoiding it.
Even if the AGI is not told to hold, e.g. compute as many digits of Pi as possible,
If it literally has no other goals then it doesn’t sound like an AGI. The phrase “potential for superhuman intelligence” sounds like it refers to a part of the program that other people could (and, in my view, will) use to create a super-intelligence by combining it with more dangerous goals.
Hanson’s position being essentially destroyed by Hanson via support that made no sense...
As far as I can tell Hanson does not disagree with Yudkowsky except for the probability of risks from AI. Yudkowsky says that existential risks from AI are not under 5%. Has Yudkowsky been able to support this assertion sufficiently? Hanson only needs to show that it is unreasonable to assume that the probability is larger than 5% and my personal perception is that he was able to do so.
Note that my comment (quoted) referred to the 2008 debate, which was not on that subject.
Think I am a troll or an idiot, let me know, I want to know
You do not appear to be trolling in this particular instance.
He will probably start arguing for the correct position once it is supported enough as not to be destroyed by an incorrect position.
The main thing I can recall from the 2008 debate mentioned was Hanson’s position being essentially destroyed by Hanson via support that made no sense, making Eliezer largely redundant.
As far as I can tell Hanson does not disagree with Yudkowsky except for the probability of risks from AI. Yudkowsky says that existential risks from AI are not under 5%. Has Yudkowsky been able to support this assertion sufficiently? Hanson only needs to show that it is unreasonable to assume that the probability is larger than 5% and my personal perception is that he was able to do so.
I have already posted various arguments for why I believe that the case for risks from AI, and especially recursive self-improvement (explosive), is not as strongly supported as some people seem to think. I haven’t come across a good refutation, or a strong argument to the contrary.
You are of course free to suspect and declare that I was already given sufficient evidence and that my motives are not allowing me to admit that I am wrong. I won’t perceive anything like that as a personal attack and ask everyone not to downvote you for any such statements. Think I am a troll or an idiot, let me know, I want to know :-)
The debate seems to be about the probability of the specific “AI in a basement goes foom” scenario. There are surely other existential risk scenarios which seem more Hansonian, e.g. firms gradually replacing all of their human capital with computers and robots until there are no humans left?
Hanson’s arguments here don’t apply to nation-states that start off with a large portion of world research resources and stronger ability to keep secrets, as we saw with nuclear weapons. He has a quite separate argument against that, which is that governments are too stupid to notice early brain emulation or AI technology and recognize that it is about to turn the world upside down.
What: even the NSA and IARPA—whose job it is?
I don’t buy it at that level of confidence. Robin says the Manhattan Project was an anomaly in wartime, and that past efforts to restrict the spread of technologies like encryption and supercomputers didn’t work for long (I’d say the cost-benefit for secrecy here was much, much worse than for human-level AI/WBE). My reply is that a delay of 4 years, like that between the US and Soviet nuclear tests, would be a long, long time for WBE or human-level AI to drive a local intelligence explosion. Software is easier to steal, but even so.
Your reply is focused on keeping secrets. I meant my comment to apply to the second claim, the one about governments being “too stupid”. That claim might be right, but it is not obvious. Government departments focused on this sort of thing (of which there are several) will understand, and no doubt already understand. The issue is more whether the communication lines are free, whether the top military brass take their own boffins seriously, and whether they go on to get approval from head office.
As for secrecy, the NSA has a long history of extreme secrecy. The main reason most people don’t know about their secret tech projects is that their secrecy is so good. If they develop a superintelligence, I figure it will be a secret one that will probably remain chained up in their basement. They are the main reason my graph has some probability mass already.
A while back I posted this minimalist account of Eliezer’s case for the importance of FAI to human survival. (Claim B technically seems too specific if you want to talk about the existential risk as a whole, but I think it reflects his view.)
So far I can’t tell if you agree that each claim has easily more than .5 probability given the evidence, nor if you share my view that Claim A as separate from the rest has P close to 1. In particular, you said here that you believe:
By the same principle, the speed or slowness of FOOM doesn’t matter in the long run unless some force with the power to stop it does so, and unless this happens every single time someone creates an unFriendly AI with the power to self-modify. I have almost no confidence that humanity in general will learn from past mistakes (and precious little confidence in the subset that could write the second or third AGI). So I think we need to look at the cumulative chance for Claim B, Claim C, and perhaps even D.
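The cumulative point can be made concrete with a small sketch. It assumes, purely for illustration, that every attempt at a self-modifying unFriendly AI goes badly with the same probability and that attempts are independent; the function name and the numbers are hypothetical and do not come from this thread.

```python
def cumulative_risk(p_per_attempt: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries goes badly.

    Assumption (illustrative only): every attempt has the same probability
    `p_per_attempt` and attempts are statistically independent.
    """
    if not 0.0 <= p_per_attempt <= 1.0:
        raise ValueError("p_per_attempt must lie in [0, 1]")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # P(at least one failure) = 1 - P(every attempt is stopped in time)
    return 1.0 - (1.0 - p_per_attempt) ** attempts


if __name__ == "__main__":
    # Hypothetical per-attempt risk of 1%, for a growing number of attempts.
    for n in (1, 10, 100):
        print(n, round(cumulative_risk(0.01, n), 3))
```

Under these assumptions even a small per-attempt chance adds up as the number of attempts grows, which is why the speed of any single FOOM matters less than whether every attempt gets stopped.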
Even so, it seems possible that the actual risk stays within 5%. Maybe you think some form of FAI, such as Friendly uploads, will prove easy once we get the capacity for some form of AGI. Maybe you think we seem likely to kill ourselves with X before then. Maybe you think some other force(s) will stop each and every AGI. If so, I’d like to hear your reasoning.
And if not, if you want to argue against my claims in some other way, please do so without identifying them with a more specific storyline.
ETA: I apparently forgot how to use links. I believe this means I should go eat or sleep. Take that as you will.
The whole dispute is about your claim A. It gives a lot of credence to Y’s idea of where things are headed (someone is going to write a single AI that takes over the world) and none to H’s (someone is going to upload some humans and make trillions of copies). Those are two very different possibilities with different consequences, and there’s no reason to believe it’s close to an exhaustive list of plausible scenarios.
Actually, I wrote Claim A in such a way that it applies to uploads. Near as I can tell H disagrees with B and/or D, or else he believes that FAI will ‘emerge’ naturally (e.g. from interactions between uploads).
Well, if we understand A to include ems, then your B, C and D are misleadingly worded. They all speak of a single AI. If you reworded them to allow for ems, then I guess I would agree that the dispute is located there. But I would be interested to see such a rewording—I think it would be favorable to Hanson’s point of view.
It’s also not clear to me that “single powerful AI or trillions of uploads” together add up to p > .5! But that might be off-topic.
Favorable? I don’t know why you’d think that. Seems to me the charitable interpretation of Hanson’s view has him thinking of ems as naturally Friendly, or near-Friendly. (My analysis didn’t mention the chance of us getting FAI without working for it.)
If we get two unFriendly AIs that individually have the power to kill humanity, and if acting quickly means they don’t have to negotiate with anyone else from this planet, they’ll divide Earth between them. If we somehow get trillions of uFAIs with practically different goals, then of course the expected value of killing humanity goes way down. But it still sounds greater than the expected value of cooperating with us, by Hanson’s analysis. And if we get one FAI out of a trillion AGIs, I think that leads to either war or a compromise like the one the Super-Happies offered the Babyeaters. We might get a one-trillionth slice of the available matter, with no more thought given to (say) the aesthetics of the Moon than we give to any random person who’d like to see his name there in green neon every night. (Still a better deal than we’d offer any one upload according to Hanson. But maybe I’ve misunderstood him?)
I also don’t understand how you get that much memory and processing power without some designer that seems awfully close to an AI-programming AI. But as a layman I may be thinking of that in the wrong way.
Oh, and I lean towards P(genocide) somewhat under .4 without FAI theory. Right now I’m just arguing that it exceeds .05 per XiXiDu’s comment. You may have misread his “5%” there.
You would have to tell me what friendly and unfriendly mean in this context. Hanson expects ems to be very numerous and very poor. I doubt he expects any one of them to have the resources available to what’s usually called an FAI. Is a human being running at human speeds F or UF?
I don’t think the notion of “cooperating with us” is coherent. Just as the trillions of ems might have practically different goals, so might the billions of live humans.
Possibly, being poor, they would not have that much memory and processing power.
Taking the last part first for context: this layman thinks that just simulating a conscious brain (experiencing something other than pure terror or slow insanity) would take a lot of resources using the copy-an-airplane-with-bullet-holes approach where you don’t know what the parts actually do, at least not well enough to make a self-reflective programming AI from scratch.
As to the rest, I’m assuming my previous claims hold for the case of a single AGI because you seemed to argue that simply introducing a lot more AGIs changes the argument. (“Cooperating with us” therefore means not killing us all.) I started out by granting that the nature of the AIs could make a big difference. The number seems almost irrelevant. It seems like you’re arguing for the possibility that no single em would have enough resources to produce super-intelligence (making assumptions about what that requires), since they might find themselves sharing a medium with trillions of competing ems before they get that far. But this appears to mean that giving more resources to any one of them (or any group with consistent goals) could easily produce super-intelligence. Someone would eventually do this. Indeed, Hanson seems to argue that a workforce of ems would help produce better technology and thus better ems.
I do have to address the possibility that the normal ems themselves could stop a self-modifying AI because they would think faster than ordinary humans. That situation would certainly decrease the risk of killing humanity. But again, for that to make sense you have to assume that effective self-modification requires vast resources (or you just get trillions of self-modifiers by assumption). You may also need to assume that a super-intelligence needs even more resources to work out a plan for killing us—otherwise the rest of the ems would seemingly have no way to discover the plan before it went into motion, except by chance. (A superior intelligence would try to include their later actions in the plan.) Note that even these assumptions do not yield near-certainty of survival given uFAI, not with the observed stupidity of humanity. Seems like you’d at least need the additional assumption that no biological human who the uFAI can reach and fool has the power to trigger our demise.
And then of course we have the relative difficulty of emulation and new designs for reflective AI. It took no time at all to find someone arguing for the impossibility of the former, on the grounds that normal emulation requires knowing what the original does well enough to copy one part and not another. If we get that knowledge it increases the likelihood of new AGI—indeed, it almost seems to require making ‘narrow AIs’ along the way, and by assumption this happens before we know what each one can do.
My main reason for doubting this part, however, lies in the fact that it suggests we can avoid otherwise difficult work, and the underlying belief seems to have grown in popularity along with our grasp of said difficulty.
That’s one of the main problems I have with the whole existential risks prediction business. There is a specific storyline; it is contained in the vagueness of your claims. If you tried to pin down a concept like ‘recursive self-improvement’ that supports the notion of an existential risk, you would end up with an argument that is strongly conjunctive. Most of the arguments in favor of risks from AI derive their appeal from vagueness, but that doesn’t mean that they are disjunctive.
Yes, I said that I believe that even sub-human level AI poses an existential risk. At the same time I am highly skeptical of FOOM. So why don’t I agree with Eliezer outright anyway? Because the risks from AI that I perceive to be a possibility are not something you can solve by inventing provable “friendliness”. How are you going to make a sophisticated monitoring system friendly? Why would people want to make it friendly? How are you going to make a virus with sub-human level general intelligence friendly? Why would one do that? Risks from AI are a broad category that needs meta-solutions involving preemptive political and security measures. You need to make sure that the first intelligent surveillance systems are employed transparently and democratically so that everyone can monitor the world for the various risks ahead. We need a global immune system that makes sure that no one anywhere gets ahead of everyone else.
Have you taken your own survey and published the results somewhere? Or is it only for AI researchers? It seems like there are a great many hidden assumptions on all sides which make these discussions go off the tracks very quickly. Some kind of basic survey with standard probability estimates might easily show where views differ.
I agree that friendliness is a long shot. If you know of a better solution, please let me know.
By developing a theory of friendliness and implementing it in software.
Because unfriendly things are bad.
By developing a theory of friendliness and implementing it in software.
Because unfriendly things are bad.
Sounds like a job for CEV and a friendly AI to run it on.
Yes I have done so. But I don’t trust my ability to make correct probability estimates, don’t trust the overall arguments and methods and don’t know how to integrate that uncertainty into my estimates. It is all too vague.
There sure are a lot of convincing arguments in favor of risks from AI. But do arguments suffice? Nobody is an expert when it comes to intelligence. Even worse, I don’t think anybody knows much about artificial general intelligence.
My problem is that I fear that some convincing blog posts are simply not enough. Just imagine all there was to climate change was someone with a blog who never studied the climate but instead wrote some essays about how it might be physically possible for humans to cause global warming. As if that were not enough, the same person then goes on to make further inferences based on the implications of those speculations. Am I going to tell everyone to stop emitting CO2 because of that? Hardly! Or imagine that all there was to the possibility of asteroid strikes was someone who argued that there might be big chunks of rock out there which might fall down on our heads and kill us all, inductively based on the fact that the Earth and the Moon are also big rocks. Would I be willing to launch a billion-dollar asteroid deflection program solely based on such speculations? I don’t think so. Luckily, in both cases, we got a lot more than some convincing arguments in support of those risks.
Another example: If there were no studies about the safety of high energy physics experiments then I might assign a 20% chance of a powerful particle accelerator destroying the universe based on some convincing arguments put forth on a blog by someone who never studied high energy physics. We know that such an estimate would be wrong by many orders of magnitude. Yet the reason for being wrong would largely be a result of my inability to make correct probability estimates, the result of vagueness or a failure of the methods I employed. The reason for being wrong by many orders of magnitude would have nothing to do with the arguments in favor of the risks, as they might very well be sound given my epistemic state and the prevalent uncertainty.
In summary: I believe that mere arguments in favor of one risk do not suffice to neglect other risks that are supported by other kinds of evidence. I believe that logical implications of sound arguments should not reach out indefinitely and thereby outweigh other risks whose implications are fortified by empirical evidence. Sound arguments, predictions, speculations and their logical implications are enough to demand further attention and research, but not much more.
If there was a risk that might kill us with a probability of .7 and another risk with .1 while our chance to solve the first one was .0001 and the second one .1, which one should we focus on?
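One way to read this question is as an expected-value comparison. A minimal sketch follows, assuming that "chance to solve" means the probability that our effort removes the risk entirely, that the two risks are independent, and that effort cannot be split between them. `expected_reduction` is a hypothetical helper, not anything proposed in this thread.

```python
def expected_reduction(p_risk: float, p_solve: float) -> float:
    """Expected drop in the probability of catastrophe from working on one risk.

    Assumption (illustrative only): if the effort succeeds, the risk is
    removed entirely; if it fails, the risk is unchanged.
    """
    return p_risk * p_solve


# The figures from the question above.
first = expected_reduction(p_risk=0.7, p_solve=0.0001)
second = expected_reduction(p_risk=0.1, p_solve=0.1)

print(f"first risk:  {first:.5f}")
print(f"second risk: {second:.5f}")
print("focus on:", "second" if second > first else "first")
```

On this simple reading the second risk offers the larger expected reduction (0.01 against 0.00007) even though the first is far more likely to kill us. The sketch says nothing about how the chance of solving a problem changes with additional investment, which a fuller analysis would need.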
Why do I feel like there’s massively more evidence than “a few blog posts”? I must be counting information I’ve gained from other studies, like those on human history, and lumping it all under “what intelligent agents can accomplish”. I’m likely counting fictional evidence, as well; I feel sort of like an early 20th century sci-fi buff must have felt about rockets to the moon. Another large part of being convinced falls under a lack of counterarguments—rather, there are plenty out there, just none that seem to have put thought into the matter.
At any rate, I’m not asking for the entire world to throw down their asteroid detection schemes or their climate mitigation strategies; that’s not politically feasible, regardless of risk probabilities. I’m just asking them to increase the size of the pie by a few million, maybe as little as one billion total, to add research about AI, and to spend more money on the whole gamut of existential risk reduction as a cohesive topic of great importance.
What would you tell the first climate scientist to examine global warming, or the first to predict asteroid strikes, other than “do more research, and get others to do research as well”?
I have no problem with a billion dollars spent on friendly AI research. But that doesn’t mean that I agree that the SIAI needs a billion dollars right now or that I agree that the current evidence is enough to tell people to stop researching cancer therapies or creating educational videos about basic algebra. I don’t think we know enough about risks from AI to justify such advice. I also don’t think that we should all become expected utility maximizers because we don’t know enough about economics, game theory, and decision theory and especially about human nature and the nature of discovery.
Maybe because there is massively more evidence and I don’t know about it, don’t understand it, haven’t taken it into account or because I am simply biased. I am not saying that you are wrong and I am right.
Shortly after human flight was invented we reached the moon. Yet human flight is not as sophisticated as bird or insect flight, it is much more inefficient, and we never reached other stars. What I get out of this is that shortly after we invent artificial general intelligence we might reach human-level intelligence and, in some areas, superhuman intelligence. But that doesn’t mean that it will be particularly fast or efficient, or that it will be able to take over the world shortly after. Artificial general intelligence is already an inference made from what we currently believe to be true; going a step further and drawing further inferences from previous speculations, e.g. explosive recursive self-improvement, is in my opinion a very shaky business. We have no idea about the nature of discovery, or whether intelligence (whatever that is) is even instrumental or quickly hits diminishing returns.
In principle we could build antimatter weapons capable of destroying worlds, but in practice it is much harder to accomplish. The same seems to be the case for intelligence. It is not intelligence in and of itself that allows humans to accomplish great feats. Someone like Einstein was lucky to be born into the right circumstances; the time was ripe for great discoveries.
Prediction: The world is going to end.
Got any counterarguments I couldn’t easily dismiss?
Most of the superficially disjunctive lines of reasoning about risks from AI derive their appeal from their inherent vagueness. It’s not like you don’t need any assumptions to be true to get “artificial general intelligence that can undergo explosive recursive self-improvement to turn all matter in the universe into paperclips”. That’s a pretty complex prediction actually.
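To illustrate why it matters whether a line of reasoning is conjunctive or disjunctive, here is a minimal sketch. It assumes the individual steps are independent and gives each a made-up probability of 0.8; none of these figures come from the thread, and `conjunction` and `disjunction` are hypothetical helper names.

```python
import math
from typing import Sequence


def conjunction(probs: Sequence[float]) -> float:
    """Probability that ALL independent assumptions hold."""
    return math.prod(probs)


def disjunction(probs: Sequence[float]) -> float:
    """Probability that AT LEAST ONE independent assumption holds."""
    return 1.0 - math.prod(1.0 - p for p in probs)


# Hypothetical: five assumptions, each judged 80% likely on its own.
steps = [0.8] * 5

print(f"all five hold (conjunctive):     {conjunction(steps):.5f}")
print(f"at least one holds (disjunctive): {disjunction(steps):.5f}")
```

With these illustrative numbers the same five assumptions give about 0.33 when all of them are required and about 0.9997 when any one suffices, so the two readings of an argument can differ enormously even when every step looks plausible.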
There are various different scenarios regarding the possibility and consequences of artificial general intelligence. I just don’t see why the one put forth by the SIAI is more likely to be true than others. Why for example would intelligence be a single principle that, once discovered, allows us to grow superhuman intelligence overnight? Why are we going to invent artificial general intelligence quickly, rather than having to painstakingly optimize our expert systems over many centuries? Why would intelligence be effectively applicable to intelligence itself, rather than demanding the discovery of unknown unknowns due to sheer luck or the pursuit of treatments for rare diseases in cute kittens? Why would general intelligence be at all efficient compared to expert systems? Maybe general intelligence demands a tradeoff between plasticity and goal-stability. I can think of dozens of possibilities within minutes, none of them leading to existential risk scenarios.
This is the part I’d like to focus on. Restating that position from my understanding, you are unconvinced that SIAI is important to fund, and you will not pay them to convince you, and it would be perfectly fine for other people to fund them, and you will be following the area to see if they provide convincing things in the future. Is that a fair characterization?
Almost, I think it is important that the SIAI continues to receive at least as much as it did last year. If the SIAI’s sustainability was at stake I would contribute money, I just don’t know how much. I would probably devote some time to think about the whole issue, more thoroughly than I have until now. Which also hints at a general problem, I think many people lack the initial incentive that is necessary to take the whole topic seriously in the first place, seriously enough to even invest the required time and resources to analyze the available data sufficiently.
I recently hinted at some problems that need to be addressed in order to convince me that the SIAI needs more money. I am currently waiting for the “exciting developments”, that have been mentioned in the subsequent comments thread, to take place.
Another problem is the secretive approach the SIAI seems to subscribe to. I am not convinced that a secretive approach is the right thing to do. I also don’t have enough confidence to just take their word for it if they say that they are making progress. They have to figure out how to convince people that actual progress is being made, or at least attempted, without revealing too much detail. They also have to explain whether they expect Eliezer Yudkowsky to be able to solve friendly AI on his own, or otherwise how they are going to guarantee the “friendliness” of future employees.
I think this thread started by timtyler is more representative of the opinion of most people (if they knew about the SIAI) than those members of lesswrong who are already sold. People here seem overly confident in what they are told without asking for further evidence. Not that I care about the AI box experiment, even prison guards can be persuaded by humans to let them out of the jail. But as timtyler said, the secretive approach employed by the SIAI, “don’t ask don’t tell”, isn’t going to convince many people any time soon. I doubt actual researchers would just trust the SIAI if they claimed they proved something without providing any evidence supporting the claim.
How do you mean? Human planes are faster and can transport freight better. They can even self-pilot with modern AI software. The biggest weaknesses would seem to be a lack of self-reproduction and self-repair, but those aren’t really part of flight.
Energy efficiency and maneuverability. I suppose a dragonfly would have been a better example. We never really went straight from no artificial flight towards generally superbird/insect flight. All we got are expert flight systems, no general flight systems. Even if we were handed the design for a perfect artificial dragonfly, minus the design for the flight of a dragonfly, we wouldn’t be able to build a dragonfly that can take over the world of dragonflies, all else equal, by means of superior flight characteristics.
Where are your figures for energy efficiency? (Recalling that the comparison should be for the same speed, or energy per kilogram transported for a kilometer given the optimal speed tradeoff).
A Harpy Eagle can lift more than three-quarters of its body weight while the Boeing 747 Large Cargo Freighter has a maximum take-off weight of almost double its operating empty weight. I suspect that insects can do better. But my whole point is that we never reached artificial flight that is strongly above the level of natural flight. An eagle can after all catch its cargo under various circumstances like the slope of a mountain or from under the surface of water, thanks to its superior maneuverability.
To solve this problem we need to know more. As it stands, the marginal effect of investment in the problems on the probability of the problems being solved is unknown—as is the temporal relationship of the problems. Do they arise at the same time? Is there going to be time to concentrate on the second problem after solving the first one? - etc.
Essentially, uncertainty → wider confidence intervals and less certainty (i.e. fewer extreme probability estimates).
Please convince me that your Roboto Protocol could work. I don’t observe politics ever producing results like the ones you seem to require.
How can I arrive at the belief that it is possible for an algorithm to improve itself in a way to achieve something sufficiently similar to human-level intelligence? That it is in principle possible is not a question here. But is it possible given limited resources? And if it is possible given limited resources, is it efficient enough to pose an existential risk?