Snowdenizing UFAI
Here is a suggestion for slowing down future secretive, unsafe projects that risk creating Unfriendly AI (UFAI).
Take the American defense and intelligence community as a case in point. They are a top candidate for creating Artificial General Intelligence (AGI): they can command massive funding, and they can put some top (or near-top) brains on the job. The AGI will be unfriendly unless friendliness is a primary goal from the start.
The American military created the Manhattan Project, which is the canonical example of a giant, secret, leading-edge science and technology project with existential-risk implications.
David Chalmers (2010): “When I discussed [AI existential risk] with cadets and staff at the West Point Military Academy, the question arose as to whether the US military or other branches of the government might attempt to prevent the creation of AI or AI+, due to the risks of an intelligence explosion. The consensus was that they would not, as such prevention would only increase the chances that AI or AI+ would first be created by a foreign power.”
Edward Snowden broke the intelligence community’s norms by reporting what he saw as tremendous ethical and legal violations. This requires an exceptionally well-developed personal sense of ethics (even if you disagree with those ethics). His actions have drawn a lot of support from those who share his values. Even many who condemn him as a traitor are still criticizing government intrusions on the basis of his revelations.
When the government AGI project starts rolling, will it have Snowdens who can warn internally about UFAI risks? They will probably be ignored and suppressed—that’s how it goes in hierarchical bureaucratic organizations. Will these future Snowdens have the courage to keep fighting internally, and eventually to report the risks to the public or to their allies in the Friendly AI (FAI) research community?
Naturally, the Snowden scenario is not limited to the US government. We can seek ethical dissidents, truthtellers, and whistleblowers in any large and powerful organization that does unsafe research, whether a government or a corporation.
Should we start preparing budding AGI researchers to think this way? We can do this by encouraging people to take consequentialist ethics seriously, which by itself can lead to Snowden-like results, and LessWrong is certainly working on that. But another approach is to start talking more directly about the “UFAI Whistleblower Pledge”:
I hereby promise to fight unsafe AGI development in whatever way I can, through internal channels in my organization, by working with outside allies, or even by revealing the risks to the public.
If this concept becomes widespread, and all the more so if people sign on, the threat of ethical whistleblowing will hover over every unsafe AGI project. Even with all the oaths and threats they use to make new employees keep secrets, the notion that speaking out on UFAI is deep in the consensus of serious AGI developers will cast a shadow on every project.
To be clear, the beneficial effect I am talking about here is not the leaks themselves—it is the atmosphere of potential leaks, the lack of trust by management that researchers are completely committed to keeping every secret. For example, post-Snowden, the intelligence agencies are requiring that sensitive files be accessed only by two people working together, and they are probably tightening their vetting guidelines and so rejecting otherwise suitable candidates. These changes make everything more cumbersome.
In creating the OpenCog project, Ben Goertzel advocated total openness as a way of accelerating the progress of those researchers who are willing to expose whatever dangerous work they might be doing—even if this means that the safer researchers end up giving their ideas to the unsafe, secretive ones.
On the other hand, Eliezer Yudkowsky has suggested that MIRI keep its AGI implementation ideas secret, to avoid handing them to an unsafe project. (See “Evaluating the Feasibility of SI’s Plans,” and, if you can stomach some argument from fictional evidence, “Three Worlds Collide.”) Encouraging openness and leaks could endanger Eliezer’s strategy. But if we follow Eliezer’s position, a truly ethical consequentialist would understand that exposing unsafe projects is good, while exposing safer projects is bad.
So, what do you think? Should we start signing as many current and upcoming AGI researchers as possible to the UFAI Whistleblower Pledge, or work to make this an ethical norm in the community?
Organizations concerned about future Snowdens will be less likely to hire someone who takes such a pledge. Indeed, since I would expect these organizations to put mechanisms in place to identify future Snowdens, I would expect them to be among the biggest supporters of getting lots of people to signal their likelihood of becoming another Snowden.
Instead, how about people (such as myself) who will never have the technical skills to help create an AGI take a pledge to provide financial support to anyone who suffers great personal loss because they exposed AGI development risks?
Your commitment idea is good.
Eliezer’s view, as I understand it, is that only the smartest of the smart have a chance of creating AGI for now.
If these outliers declare their principles, then dangerous AGI projects will be unable to recruit the researchers who can make them work (until Moore’s Law of Mad Science has its effect). And perhaps lesser minds will follow these leaders in choosing the norms for their community.
If we compare to the canonical example, the Manhattan Project, we see that the smartest people did not refuse to join. World War II was on, and their viewpoint is at least understandable. A few passed information to the Soviet Union—I don’t know if we can analogize to that. A few people left the project (Joseph Rotblat) or turned into anti-nuclear crusaders afterward. A few leading minds just continued developing newer and nastier bombs.
But Szilard understood the tremendous danger from A-bombs before they were built. Einstein was a pacifist, and many physicists had strong ethical convictions. If they had all spoken loudly and publicly about the evils of A-bombs, and made that a norm in their community in the 1930s, that might have slowed down the Manhattan Project.
The Manhattan Project was a strategic failure because it greatly helped the Soviets build atomic weapons. The U.S. military would have been far better off if it had chosen more carefully who could work on nuclear weapons development, even if this had added several years to how long it took to get atomic weapons.
So, analogizing to a future AGI project, you’re saying that having more “ideologically incorrect” people in the research community can indeed harm a potentially dangerous project.
If the leaders of the unsafe project exclude more “ideologically incorrect” people, then this will add to the time required for development.
On the other hand, if there are more people with the “incorrect,” leak-prone ideology, and these are not excluded (possibly because they never made their ideology public), then a potentially beneficial leak is more likely.
Yes
The role of espionage in the Soviet nuclear weapons program has been greatly exaggerated. While spies did accelerate their progress, it’s pretty clear that they could have developed nuclear weapons entirely on their own. I don’t think things would have been vastly different if the only information they had was that the US had dropped nuclear weapons on Hiroshima and Nagasaki.
My guess is that the most likely result would be that (1) by no means all AI researchers would take the pledge (because such things never get universally adopted), and then (2) the government would preferentially hire people who hadn’t taken it or who would credibly undertake to break it in the name of National Security, and then (3) the first-order effect of the pledge would be to reduce the fraction of people in any government AI effort who are at all concerned about Friendliness.
That doesn’t seem to be a win.
I suppose that if you could get it near-universally adopted it might work, but that would require a degree of buy-in I see no way of getting.
Yes, this is the key question. But if we grant that current AI work is not likely to lead to AGI, and that a new generation of research and researchers is needed, then perhaps this sort of pledge can be the norm.
It doesn’t need to be a formal pledge, just a well-known point of view. If researchers are known to be part of the MIRI circle, that is already halfway to rendering them “suspect” of having a strong safety concern.
I’m glad the US gov angle is getting more discussion in LessWrong, but I question this part:
Seems often not true. The team on the Manhattan Project, for example, rather thoroughly investigated the possibility that a nuclear test could “ignite” (set off a nuclear chain reaction in) the atmosphere before proceeding. Also, it’s one thing for people in a bureaucracy to ignore problems that will only affect other people, but I’d expect them to be more careful about problems that could get themselves killed.
Indeed, one of my major fears with any AI project is that those involved would be very careful about checking that activating the AI would go well for them, but less careful about checking that it would go well for other people, especially people they don’t consider part of their “in group,” whatever that is.
Suppose that they had calculated a probability X of igniting the atmosphere. How large do you think would X have to be for the government to agree to halt the project?
Relying on the ethics or the sanity of the project’s scientists who would need to act in concert against the project’s administrators and owners does not seem feasible.
Probably between 1 and 10% if you put it that way, which is of course insanely high. But that’s not how I’d actually expect it to be framed. If it had come up as an issue, I would have expected it to go something like this:
Leslie Groves: I hear there’s a holdup.
J. Robert Oppenheimer: Yeah, about that. We’ve calculated that there’s a chance the device will initiate a chain reaction in the atmosphere, killing everyone on the planet and melting most of its crust. Consensus among the boys is this falls outside operational requirements.
Groves: How big of a chance?
Oppenheimer: We don’t know exactly. It’s a theoretical issue, and jargon jargon error bars jargon technobabble.
Groves: The Jerries won’t be waiting for us. I need you to hurry this up.
Oppenheimer: We don’t want to proceed with a full-scale test until we’ve ruled this out, but we can work out the theory in parallel. I’ll put that young fellow from the computation group on it.
Groves: Don’t disappoint me.
I’m not sure what you mean, or why you say this.
Suppose the young fellow working in parallel comes back and says it’s 0.95% to the best of everyone’s knowledge. You say that you’d expect the government to proceed with the test and overrule any project members who disagreed. And if they protested further, they’d be treated like other dissenters during wartime and would be at least removed from the project.
To put it mildly, I’d rather that governments not accept a 0.95% chance of destroying all life on Earth in return for an advantage in a weapons race.
You estimate the government might press ahead even with a 9% probability of extinction. If every competing government takes on a risk of this magnitude—perhaps a risk of its own project failing that is really independent of its competitors, as with the risk of releasing an AI that turns out to be Unfriendly—then with 10 such projects the combined probability of extinction is already well over half, and it keeps climbing as more projects try.
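A quick sketch of the compounding, assuming the per-project risks really are independent and each is around the 9% figure above:

$$P(\text{extinction}) = 1 - (1 - 0.09)^{10} \approx 0.61$$

The risks do not simply add, but the combined danger already exceeds any single project’s, and it grows with every additional project.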
I mean that the military administration of the Manhattan Project wasn’t actually equipped to deal with existential risk calculations, the scientific side of the project would have known this, and the administrative side would have known they’d known. It’s effectively a technical obstacle and would have been dealt with as such.
In actuality, that question resolved itself when further investigation showed that it wasn’t going to be a problem. But if the answer had been “no, we’ve done the math and we think it’s too risky”, I think that would have been accepted too (though probably not immediately or without resistance). I don’t think that flat percentages would at any stage have been offered for the interpretation of people not competent to interpret them.
Um, hypothetically, once the first AGI is released (Friendly or not) it isn’t going to give the next group a go.
Only the odds on the first one to be released matter, so the risks across projects don’t compound like that.
With that said, you’re right that it would be a good thing for governments to take existential risks seriously, just like it would be a good thing for pretty much everyone to take them seriously, ya?
Good! If the hypothetical future government project has a strong concern for AGI safety, then we’re doing better than I expected already!
I’m not getting the impression that the actual Snowden’s actions are going to succeed in stopping universal surveillance.
Equally, if someone leaks information about an unsafe AI being developed by a superpower (or by a cabal of many countries, like the other intelligence services that cooperate with the NSA), that would likely only make the backing governments try to speed up the project (now that it’s public, it may have lost some of its lead in a winner-take-all game) and to hide it better.
If someone is smart enough to build the first ever AGI, and they don’t care about Friendliness, are they likely to have any goals other than literally taking over the world? And are they going to care about public opinion, or a few commentators condemning their ethics?
So while all of this is probably a step in the right direction and worthwhile, I don’t think it would be nearly enough of a safeguard.
In this posting, I am not focusing on the value of a leak.
I am focusing on the value of making the project leaders suspicious that all the smart people are potentially against them.
To compare to the Snowden story: Hacker culture is anti-authoritarian. Yet the NSA has relied on the fact that there are good tech people who are willing to join a government agency. Now, the NSA is probably wondering how widespread the “ideologically incorrect” hacker ethos really is in their organization.
Comparing it to AGI: What if (1) only a handful of people were good enough to make AGI progress and (2) an anti-authoritarian ideology were known to be widespread among such people?
Actually, I wonder how the NSA and the like have managed to get the best cryptographers, hackers, and other math/creative people to work there. To be sure, there are some fine math PhDs and the like who are glad to sign on, but if it is true that the smartest people, like Turing or Feynman, are irreverent rule-breakers, then how does the NSA recruit such people?
(I asked the question on Quora, but I am not quite satisfied with the answer.)
World War II created a situation where even the rule-breaker types were willing to join the fight against Hitler.
Sure, I am not suggesting this as an adequate safeguard on its own.
Reportedly (going by the top Google result), there are 5 million people with high security clearance in the US, including 500,000 outside contractors, as Snowden was. And yet in the last two decades there have been very few ethics-driven leaks (10? 20?), and none of them were on the same scale as Snowden’s. And that was before Snowden and the current crackdown on whistleblowing in the US; I expect the rate of whistleblowing/leaking to go down over time, not up.
This is strong evidence that defectors who leak data are extremely rare. You can never eliminate all leaks among millions of people, but so far the government has accomplished much more than I would have expected. Project leaders should not worry unduly.
Snowden’s leaks were apparently driven by specific ethics, not general hacker anti-authoritarianism. Whistleblowing designed to expose illegal or unethical conduct is probably correlated with anti-authoritarianism but they’re not the same.
I’m not convinced that it is true that intelligence is correlated with rule-breaking or anti-authoritarianism. What’s your evidence, aside from anecdotes about individuals like Turing and Feynman?
Maybe they shouldn’t worry, but they always do.
Note the new rules imposed since the Snowden leaks, such as the two-man rule for accessing sensitive files, and a stronger concern for vetting (which inevitably slows down recruitment and excludes some good people).
In general, bureaucracies always respond to scandals with excessive new rules that slow down all work for everyone.
All it takes is a reasonable probability of one leak, and project leaders get uptight.
It’s a good question, and other than a general impression and a wide variety of slogans (“think out of the box,” “Be an individual” etc), I don’t have any evidence.
But that hasn’t stopped them. Indeed, similar programs are only picking up.
Not that I particularly object to any attempt to raise awareness about this issue, quite the opposite in fact. This objection is based on your own analogy.
You realise, of course, that achieving this precise effect was a lot of the point of Wikileaks.
*(corrected per gjm below)
Either you copied the wrong thing into that quotation, or I am disastrously failing to understand your point.
(For the benefit of future readers in case it was indeed a mistake and David fixes it: At the moment his comment begins with the following in a blockquote: “The current official image of Bernice Summerfield, as used on the Bernice Summerfield Inside Story book, published October 2007”. Bernice Summerfield is a character in Doctor Who and not, so far as I know, in any way associated with the NSA or AGI or mentioned anywhere else in this thread.)
I’m guessing that the quoted material was actually meant to be “I am focusing on the value of making the project leaders suspicious that all the smart people are potentially against them”.
You are of course quite correct. I have edited my post. Thank you :-)
I am getting the impression that they were a good start, probably the best that he—a single guy—could have done. Certainly many more people are aware of it and many of them are pissed. Big companies are unhappy too.
It’s certainly a great deal for one person to have accomplished (and most of the data he leaked hasn’t been released yet). Nevertheless, we don’t yet know if government surveillance and secrecy will be reduced as a result.
This is a pretty much impossible criterion to satisfy.
Just as with AGI defectors, what you get might not be ideal or proper or satisfying or even sufficient—but that’s what you got. Working with that is much preferable to waiting for perfection.
The criterion of “will surveillance and secrecy be reduced as a result” is the only relevant one. Naturally we can’t know results in advance for certain, and that means we can’t grade actions like Snowden’s in advance either. We do the best we can, but it’s still legitimate to point out that the true impact of Snowden’s actions is not yet known, when the discussion is about how much you expect to benefit from similar actions taken by others in the future.
Keep in mind that leakers will be a problem for FAI projects as well.
What good would a Snowden do? The research would continue.
Yes, but fear of a Snowden would make project leaders distrustful of their own staff.
And if many top researchers in the field were known to be publicly opposed to any unsafe project that the agencies are likely to create, it would shrink their recruiting pool.
The idea is to create a moral norm in the community. The norm can be violated, but it would put a crimp in the projects as compared to a situation where there is no such moral norm.
This presupposes that the AGI community is, on average, homogeneous across the world and would behave accordingly. What if the political climates, traditions, and cultures make certain (powerful) countries less likely to be fearful, given their own AGI pools?
In other words, if country A distrusts its staff more than country B due to political/economic/cultural factors, country A will fall behind in the AGI arms race, which would lead to the “even if I hold onto my morals, we’re still heading into the abyss” attitude. I could see organizations or governments rationalizing against the community moral pledge in this way by highlighting the futility of slowing down the research.
The AGI community is tiny today. As it grows, its future composition will be determined by the characteristics of the tiny seed from which it expands.
I won’t claim that the future AGI community will be homogeneous, but it may be possible to establish norms starting today.
Indeed. Just imagine the fear of the next Snowden in the NSA, and trying to work out just how many past Snowdens they’ve had who took their secrets to the enemy rather than the public.
Yes, exactly.
You’ve made my point clearly—and perhaps I didn’t make it clearly enough in my post. I was focusing not on a leak in itself, but on what suspicion can do to an organization. As I described it, the suspicion would “cast a shadow” and “hover over” the project.
At this point, the NSA may well be looking for anyone who has expressed hacker/cypherpunk/copyfighter sentiments. Not that these need disqualify someone from serving in the NSA, but by now the agency is probably pretty suspicious.
I would like to agree with you but experience says otherwise. Tyrants have always been able to find enough professionals with dubious morals to further their plans.
In World War I, German Jewish scientists contributed to the German war effort. In World War II, refugee scientists contributed to the Allied war effort. Tyrants can shoot themselves in the foot quite effectively.
A few top physicists were left in Germany, including Heisenberg, but it was not enough to move the project forward, and it’s suspected that Heisenberg may have deliberately sabotaged the project.
But you have a point. So long as AGI is at the cutting edge, only a handful of top people can move it forward. As Moore’s Law of Mad Science has its effect, “ordinary” scientists will be enough.
(And to make it clear, I am not suggesting that the US government is tyrannical.)
There are plenty of cases where the government puts a bunch of incompetent people on a project and the project fails.
If the project does not take safety into account, we want exactly this—so long as it doesn’t get close enough to success that failure involves paper-clipping the world.
I suspect that this pledge would be just as “effective” as other pledges, such as the pledge of abstinence, for all the same reasons. The person taking the pledge does not really know him or herself (current or future) well enough to reliably precommit.
I agree.
But the threat would hover over every AGI project. You’d have a team of people, many of whom took the pledge (or may have done so without everyone knowing), and everyone would know that the project has to rely on everyone continuing to violate their earlier pledge.
Also, since the team would be smart people aware of AGI progress outside their organization, they would know that they are violating a deep deontological norm based on clear consequentialist principles. True, people find justifications for what they do, but on the assumption that possibly-successful AGI researchers are smart and have some rationalism in them, this idea would still cast a shadow.
Not only the researchers but also their managers would know this.
And if you think that a leak is valuable—all it takes is one person to do it.
I don’t really like seeing this type of topic, because most of the interesting things that can be said on the matter shouldn’t be discussed in public.
Yes, I thought of that. But unless there are clear guidelines, it’s impossible to tiptoe around all sorts of subjects, worrying that we shouldn’t talk about them.
And as with sensitive topics in national security—e.g., Stuxnet, etc—it’s accepted that people without security clearance can natter on all they want; it’s the people who are in the know who have to be cautious about speaking publicly.
Just a minor nitpick: The intelligence community had nothing to do with the Manhattan Project; work on nuclear weapons was initiated by direct presidential executive order, which created an entirely new committee for nuclear weapons research under the control of the US Army (and even the Army’s own intelligence service had little to do with the program itself).
Thanks, I edited it.
Hmmmm, now potential ethical AGI researchers have a page with that text in their browser history.
...which is kind of the point :-)
No, Kyre is saying that potential ethical AGI researchers now have this page in their browser history, which will likely lead to the government not hiring them for such a project. Meaning there will be fewer ethics-minded researchers on the project.
EDIT: Never mind, gjm and James_Miller already said the same thing.
Right, this is the moral problem of “dirty hands” (Sartre). Do the good people stay out and keep their hands clean, leaving the field to the less scrupulous? Or do they join the system and try to achieve change from within? It’s not easy to answer such questions.
It is indeed not easy to answer such questions, but it’s rather worth trying, no?
The “Evaluating the Feasibility of SI’s Plans” link is broken.
Thank you, I fixed it.
Hardly. Sabotaging unsafe projects is good. But exposing them may not be, if it creates other unsafe projects.
Indeed, it seems implausible that exposing unsafe projects to the general public is the best way to sabotage them—anyone who already had the clout to create such a secret project is unlikely to stop, either in the wake of a leak or because secrecy precautions to prevent one are too cumbersome.
Mind you, I haven’t thought in much depth about this—I don’t anticipate being in such a position—but if the author of this has, they should present their thoughts in the article rather than jumping straight to methods immediately after outlining the concept.
Right, Eliezer also pointed out that exposing a project does not stop it from continuing.
However, project managers in intelligence organizations consider secrecy to be very important. The more they fear exposure, the more they will burden their own project with rules.
Also, on occasion, some secret projects have indeed been stopped by exposure, if they violate laws or ethical rules and if the constellation of political forces is right. However, this is less important in my argument than the idea of slowing down a project as mentioned above.
Yeah, I hadn’t seen the other comments saying the same thing, or your replies to them. Maybe add that to the bottom or something?
I don’t think it quite answers my objection, though. Two points:
One: if a project is legitimately making progress on AGI, in secret, exposing it will most likely create more unsafe projects rather than reduce existential risk (unless you think it’s REALLY CLOSE, that the leak will shut it down permanently, and that the next projects will be sufficiently slower to be worth it).
Two: given point one, how can we expect to encourage Snowden-esque consequentialist leakers? We would need, as Eliezer has put it elsewhere, a whole anti-epistemology to support our Noble Lie.
Where’s the Noble Lie? The whole point is to decide if encouraging leaking is a good thing; a leaker by definition is encouraging leaks.
If actually leaking anything does serious harm, then persuading people that leaking is a good idea—in order to create an atmosphere of potential leaks—is lying, because leaking is a bad idea. So goes the theory.
It may or may not follow that encouraging leaking is also a bad idea—this gets tricky depending on whether you expect the paranoia to prevent any actual uFAI projects, and whether you can use that to persuade people to commit to leaking instead.
Would you expect this approach to actually prevent rather than compromise every project? That’s another argument, I guess, and one I haven’t commented on yet.
Not proven.
No, not “proven” but highly likely.
The likelihood hasn’t been proven either. This, this, and this.
How does the Orthogonality Thesis help your point?
Those aren’t terribly helpful or persuasive arguments. And the second (broken) link, when repaired, is I believe supposed to point to http://lesswrong.com/lw/bfj/evidence_for_the_orthogonality_thesis/68np, yes? That’s not really that helpful, since it just amounts to saying that an AI won’t be a random point in mind space and that some possible methods (particularly uploads) might not be awful. That’s not exactly very strong as arguments go.
It’s not strong in the sense of reducing the likelihood of uFAI to 0. It’s strong enough to disprove a confident “will be unfriendly.” Note that the combination of low likelihood and high impact (and asking for money to solve the problem) is a Pascal’s mugging.
So how low a likelihood do you need before it is a Pascal’s mugging? 70%? 50%? 10%? 1%? Something lower?
That’s not my problem. It’s MIRI’s problem to argue that the likelihood is above their threshold.
… nnnot if your goal is “find out whether or not AI existential risk is a problem,” and not “win an argument with MIRI”.
Do the contradictions in the Bible matter? Are atheists trying to save their souls, or trying to win an argument with believers?
You’ve argued that this is a Pascal’s mugging. So where do you set that threshold?
I argue that a sufficiently low likelihood is a Pascal’s mugging, by MIRI’s definition, so MIRI needs to show that the likelihood is above that threshold.
I fail to follow that logic. There’s not some magic opinion associated with MIRI that’s relevant to this claim. MIRI’s existence or opinions of how to approach this doesn’t alter at all whether or not this is an existential threat that needs to be taken seriously, or whether the orthogonality thesis is plausible, or any of the other issues. That’s an example of the genetic fallacy.
Whether or not anyone should believe that this is an existential threat that needs to be taken seriously depends on whether or not the claim can be justified, and only MIRI is making this specific version of the claim. You are trying to argue “never mind the justification, look at the truth,” but truth is not knowable except by justifying claims. If MIRI/LW is making a kind of claim, a Pascal’s mugging (as defined by MIRI/LW), that MIRI/LW separately maintains is not the kind of claim that should be believed, then MIRI/LW is making incoherent claims (like “Don’t believe holy books, but believe the Bible”).
Your first link is broken.