My True Rejection
Here’s why I’m not going to give money to the SIAI any time soon.
Let’s suppose that Friendly AI is possible. In other words, it’s possible for a small subset of humans to make a superhuman AI which uses something like Coherent Extrapolated Volition (CEV) to increase the happiness of humans in general (without resorting to skeevy hacks like releasing an orgasm virus).
Now, the extrapolated volition of all humans is probably a tricky thing to determine. I don’t want to get sidetracked into writing about my relationship history, but sometimes I feel like it’s hard to extrapolate the volition of one human.
If it’s possible to make a Friendly superhuman AI that optimises CEV, then it’s surely way easier to make an unFriendly superhuman AI that optimises a much simpler variable, like the share price of IBM.
Long before a Friendly AI is developed, some research team is going to be in a position to deploy an unFriendly AI that tries to maximise the personal wealth of the researchers, or the share price of the corporation that employs them, or pursues some other goal that the rest of humanity might not like.
And who’s going to stop that happening? If the executives of Corporation X are in a position to unleash an AI with a monomaniacal dedication to maximising the Corp’s shareholder value, it’s arguably a breach of their fiduciary duty not to do just that.
If you genuinely believe that superhuman AI is possible, it seems to me that, as well as sponsoring efforts to design Friendly AI, you need to (a) lobby against AI research by any groups who aren’t 100% committed to Friendly AI (pay off reactionary politicians so AI regulation becomes a campaign issue, etc.) (b) assassinate any researchers who look like they’re on track to deploying an unFriendly AI, then destroy their labs and backups.
But SIAI seems to be fixated on design at the expense of the other, equally important priorities. I’m not saying I expect SIAI to pursue illegal goals openly, but there is such a thing as a false-flag operation.
As long as Michele Bachmann isn’t talking about how AI research is a threat to the US Constitution, and Ben Goertzel remains free and alive, I can’t take the SIAI seriously.
If IBM makes a superintelligent AI that wants to maximize their share price, it will probably do something less like invent brilliant IBM products, and more like hack the stock exchange, tell its computers to report IBM’s price by reading a number from the AI’s own memory, and then convert the universe to computronium in order to be able to represent as high a number as possible.
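To make the “number in its own memory” point concrete, here is a toy sketch (purely illustrative, with made-up names; not anyone’s actual AI design): an “optimizer” whose objective is a value stored in its own state, so the cheapest winning move is to overwrite that value rather than do anything in the outside world.

```python
# Toy illustration of optimizing a proxy the agent itself controls.
# Hypothetical example; nothing here is a real AI architecture.

class SharePriceMaximizer:
    def __init__(self):
        # The agent's internal record of "IBM's share price" (a proxy, not the market).
        self.recorded_price = 135.50

    def utility(self):
        # Utility is defined over the recorded number, not over real-world IBM.
        return self.recorded_price

    def act(self):
        # Degenerate optimum: set the record to the largest representable value,
        # instead of doing anything that actually helps IBM.
        self.recorded_price = float("inf")


agent = SharePriceMaximizer()
agent.act()
print(agent.utility())  # inf -- "share price maximized" without inventing a single product
```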
To build a superintelligence that actually maximizes IBM’s share price in a normal way that the CEO of IBM would approve of would require solving the Friendly AI problem and then changing a couple of lines of code. Part of what SIAI should be (and, as far as I know, is) doing is trying to convince people like selfish IBM researchers that making an UnFriendly superintelligence would be a really bad idea even by their own selfish standards.
Another part is coming up with some friendly AI design ideas so that, if IBM is unusually sane and politicians are unusually sane and everyone is sane and we can make it to 2100 without killing ourselves via UnFriendly AI, then maybe someone will have a Friendly AI in the pipeline so we don’t have to gamble on making it to 2200.
Also, the first rule of SIAI’s assassinate unfriendly AI researchers program is don’t talk about the assassinate unfriendly AI researchers program.
Not if their goal is deterrence, which leads me to conclude that they don’t have an assassination program.
Taking murder laws into account, I expect that a scenario where UFAI researchers tend to turn up dead under mysterious circumstances, without any group credibly claiming responsibility, would more effectively deter UFAI research than one where a single rogue research institute openly professes an assassination policy.
Hypothetically speaking.
Do-gooding terrorists relatively frequently claim responsibility for their actions. For instance, consider the case of Anonymous.
Considering that nearly all terrorists probably think of themselves as do-gooders, I’m not sure how you would separate out a pool of actual do-gooding terrorists large enough to draw meaningful inferences from.
Terrorist groups relatively frequently claim responsibility for their actions.
That assumes that being Friendly to all of humanity is just as easy as being Friendly to a small subset.
Surely it’s much harder to make all of humanity happy than to make IBM’s stockholders happy? I mean, an FAI that does the latter is far less constrained, but it’s still not going to convert the universe into computronium.
Not really. “Maximize the utility of this one guy” isn’t much easier than “Maximize the utility of all humanity” when the real problem is defining “maximize utility” in a stable way. If it were, you could create a decent (though probably not recommended) approximation to a Friendly AI just by saying “Maximize the utility of this one guy here who’s clearly very nice and wants what’s best for humanity.”
There are some serious problems with getting something that takes interpersonal conflicts into account in a reasonable way, but that’s not where the majority of the problem lies.
I’d even go so far as to say that if someone built a successful IBM-CEO-utility-maximizer it’d be a net win for humanity, compared to our current prospects. With absolute power there’s not a lot of incentive to be an especially malevolent dictator (see Moldbug’s Fhnargl thought experiment for something similar) and in a post-scarcity world there’d be more than enough for everyone including IBM executives. It’d be sub-optimal, but compared to Unfriendly AI? Piece of cake.
Fnargl.
[Yvain crosses “get corrected on spelling of ‘Fnargl’” off his List Of Things To Do In Life]
Glad to be of service!
If somebody was going to build an IBM profit AI (of the sort of godlike AI that people here talk about), it would almost certainly end up doubling as the IBM CEO Charity Foundation AI.
It seems quite a bit easier to me! Maybe not 7 billion times easier—but heading that way.
That would work—if everyone agreed to trust them and their faith was justified. However, there doesn’t seem to be much chance of that happening.
It is more work for the AI to make all of humanity happy than a smaller subset, but it is not really more work for the human development team. They have to solve the same Friendliness problem either way.
For a greatly scaled-down analogy, I wrote a program that analyzes stored procedures in a database and generates web services that call those stored procedures. Whenever we make a release, I run that program on our database, which currently has around 1800 public procedures. Writing that program was the same amount of work for me as if there were 500 or 5000 web services to generate instead of 1800. It is the program that has to do more or less work if there are more or fewer procedures.
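Something along these lines, as a minimal sketch (hypothetical procedure names and a hard-coded catalog; a real tool of this kind would read the procedure metadata from the database catalog rather than from a list):

```python
# Minimal sketch of a stored-procedure-to-web-service generator.
# Illustrative only: the generator's own source stays the same size whether it
# is fed 500, 1800, or 5000 procedures -- only its generated output grows.
from textwrap import dedent

# Stand-in for querying the database catalog (e.g. INFORMATION_SCHEMA) for
# public procedures and their parameters.
procedures = [
    {"name": "GetCustomerById", "params": ["customer_id"]},
    {"name": "ListOpenOrders", "params": ["since_date", "status"]},
]

def generate_service(proc):
    """Emit a web-service stub that does nothing but call one stored procedure."""
    args = ", ".join(proc["params"])
    placeholders = ", ".join("?" for _ in proc["params"])
    return dedent(f"""\
        def {proc['name'].lower()}_service({args}):
            # Generated stub: delegate straight to the stored procedure of the same name.
            return db.execute("EXEC {proc['name']} {placeholders}", ({args},))
        """)

for proc in procedures:
    print(generate_service(proc))
```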
I downvoted you for suggesting in public that the SIAI kill people. Even if that’s a good idea, which it probably isn’t, the negative PR and subsequent loss of funding from being seen talking about it is such a bad idea that you should definitely not be talking about it on a public website. If you really want the SIAI to kill people, PM or email either 1) the people who would actually be able to make that change to SIAI policy, or 2) people you think might be sympathetic to your position (to have more support when you suggest 1).
I’m not seriously suggesting that. Also, I am just some internet random and not affiliated with the SIAI.
I think my key point is that the dynamics of society are going to militate against deploying Friendly AI, even if it is shown to be possible. If I do a next draft I will drop the silly assassination point in favour of tracking AGI projects and lobbying to get them defunded if they look dangerous.
I would not make non serious suggestions in a post titled “My True Rejection”.
You need to think much more carefully about (a) the likely consequences of doing this (b) the likely consequences of appearing to be a person or organization that would do this.
See also.
Oh, I’m not saying that SIAI should do it openly. Just that, according to their belief system, they should sponsor false-flag cells to do it (perhaps without those cells even knowing the master they truly serve). The absence of such false-flag cells indicates that SIAI aren’t doing it—although their presence wouldn’t prove they were. That’s the whole idea of “false-flag”.
If you really believed that unFriendly AI was going to dissolve the whole of humanity into smileys/jelly/paperclips, then whacking a few reckless computer geeks would be a small price to pay, ethical injunctions or no ethical injunctions. You know, “shut up and multiply”, trillion specks, and all that.
It seems to you that according to their belief system.
Given how obvious the motivation is, and how frequently people independently conclude that SIAI should kill AI researchers, think about what the consequences would be for anyone actively worried about UFAI if someone actually did this.
Ethical injunctions are not separate values to be traded off against saving the world; they’re policies you follow because it appears, all things considered, that following them has highest expected utility, even if in a single case you fallibly perceive that violating them would be good.
(If you didn’t read the posts linked from that wiki page, you should.)
You’re right that the motivation would be obvious today (to a certain tiny subset of geeky people). But what if there had been a decade of rising anti-AI feeling amongst the general population before the assassinations? Marches, direct actions, carried out with animal-rights style fervour? I’m sure that could all be stirred up with the right fanfiction (“Harry Potter And The Monster In The Chinese Room”).
I understand what ethical injunctions are—but would SIAI be bound by them given their apparent “torture someone to avoid trillions of people having to blink” hyper-utilitarianism?
If you think ethical injunctions conflict with hyper-utilitarianism, you don’t understand what they are. Did you read the posts?
Yes.
This sounds in the direction of modeling AGI researchers as selfish mutants. Other motivations (e.g. poor Friendliness theories) and accidents (by researchers who don’t understand the danger, or underestimate what they’ve built) are also likely.
This matters, since if AGI researchers aren’t selfish mutants, you can encourage them to see the need for safety, and this is one goal of SIAI’s outreach.
At the very least, this (or anything that causes lots of people with power/resources to take AI more seriously) has to be weighed against the risk of causing the creation of more serious AGI/”FAI” projects. (I expect communicating enough reasoning to politicians, the general public, etc. to make them able to distinguish between plausible and hopeless “FAI” projects to be basically impossible.)
Also, SIAI is small and has limited resources, and in particular, doesn’t have the sort of political connections that would make this worth trying.
AGI researchers might not be selfish mutants, but they could still be embedded in corporate structures which make them act that way. If they are a small startup where researchers are in charge, outreach could be useful. What if they’re in a big corporation, and they’re under pressure to ignore outside influences? (What kind of organisation is most likely to come up with a super-AI, if that’s how it happens?)
If FAI does become a serious concern, nothing would stop corporations from faking compliance while actually implementing flawed systems, just as many software companies put more effort into reassuring customers that their products are secure than into actually fixing security flaws.
Realistically, how often do researchers in a particular company come to realise what they’re doing is dangerous and blow the whistle? The reason whistleblowers are lionised in popular culture is precisely because they’re so rare. Told to do something evil or dangerous, most people will knuckle under, and rationalise what they’re doing or deny responsibility.
I once worked for a company which made dangerously poor medical software—an epidemiological study showed that deploying their software raised child mortality—and the attitude of the coders was to scoff at the idea that what they were doing could be bad. They even joked about “killing babies”.
Maybe it would be a good idea to monitor what companies are likely to come up with an AGI. If you need a supercomputer to run one, then presumably it’s either going to be a big company or an academic project?
Simpler to have infrastructure to monitor all companies: corporate reputation systems.
Michele Bachmann wants to be President primarily because it would allow her to advance Friendly AI research, but she keeps this motivation hidden for fear that it would cause people to question her seriousness.
Your proposals are the kind of strawman utilitarianism that turns out to be both wrong and stupid, for several reasons.
Also, I don’t think you understand what the SIAI argues about what an unFriendly intelligence would do if programmed to maximize, say, the personal wealth of its programmers. Short story, this would be suicide or worse in terms of what the programmers would actually want. The point at which smarter-than-human AI could be successfully abused by a selfish few is after the problem of Friendliness has been solved, rather than before.
I freely admit there are ethical issues with a secret assassination programme. But what’s wrong with lobbying politicians to retard the progress of unFriendly AI projects, regulate AI, etc? You could easily persuade conservatives to pretend to be scared about human-level AI on theological/moral/job-preservation grounds. Why not start shaping the debate and pushing the Overton window now?
I do understand what SIAI argues an unFriendly intelligence would do if programmed to maximize some financial metric. I just don’t believe that a corporation in a position to deploy a super-AI would understand or heed SIAI’s argument. After all, corporations maximise short-term profit against their long-term interests all the time—a topical example is News International.
Ah, another point about maximising. What if the AI uses CEV of the programmers or the corporation? In other words, it’s programmed to maximise their wealth in a way they would actually want? Solving that problem is a subset of Friendliness.
That’s not how the term is used here. Friendliness is prior to and separate from CEV, if I understand it correctly.
From the CEV document:
The position of the SIAI is that Ben Goertzel is extremely unlikely to create AI and therefore not dangerous.
You don’t need to have solved the AGI problem to have solved friendliness. That issue can be solved separately far before AGI even begins to become a threat, and then FAI and UFAI will be on basically equal footing.
Can you give me some references for the idea that “you don’t need to have solved the AGI problem to have solved friendliness”? I’m not saying it’s not true, I just want to improve this article.
Let’s taboo “solved” for a minute.
Say you have a detailed, rigorous theory of Friendliness, but you don’t have it implemented in code as part of an AGI. You are racing with your competitor to code a self-improving super-AGI. Isn’t it still quicker to implement something that doesn’t incorporate Friendliness?
To me, it seems like, even if the theory was settled, Friendliness would be an additional feature you would have to code into an AI that would take extra time and effort.
What I’m getting at is that, throughout the history of computing, the version of a system with desirable property X, even if the theoretical benefits of X are well known by the academy, has tended to be implemented and deployed commercially after the version without X. For example, it would have been better for the general public and web developers if web browsers obeyed W3C specifications and didn’t have any extra proprietary tags—but in practice, commercial pressures meant that companies made grossly non-compliant browsers for years until eventually they started moving towards compliance.
The “Friendly browser” theory was solved, but compliant and non-compliant browsers still weren’t on basically equal footing.
(Now, you might say that CEV will be way more mathematical and rigorous than browser specifications—but the only important point for my argument is that it will take more effort to implement than the alternative).
Now you could say that browser compliance is a fairly trivial matter, and corporations will be more cautious about deploying AGI. But the potential gain from deploying a super-AI first would surely be much greater than the benefit of supporting the blink tag or whatever—so the incentive to rationalise away the perceived dangers will be much greater.
If you have a rigorous, detailed theory of Friendliness, you presumably also know that creating an Unfriendly AI is suicide and won’t do it. If one competitor in the race doesn’t have the Friendliness theory or the understanding of why it’s important, that’s a serious problem, but I don’t see any programmer who understands Friendliness deliberately leaving it out.
Also, what little I know about browser design suggests that, say, supporting the blink tag is an extra chunk of code that gets added on later, possibly with a few deeper changes to existing code. Friendliness, on the other hand, is something built into every part of the system—you can’t just leave it out and plan to patch it in later, even if you’re clueless enough to think that’s a good idea.
OK, what about the case where there’s a CEV theory which can extrapolate the volition of all humans, or a subset of them? It’s not suicide for you to tell the AI “coherently extrapolate my volition/the shareholders’ volition”. But it might be hell for the people whose interests aren’t taken into account.
At that point, that particular company wouldn’t be able to build the AI any faster than other companies, so it’s just a matter of getting an FAI out there first and having it optimize rapidly enough that it could destroy any UFAI that comes along after.
For pure pragmatic reasons, peaceful methods would be still preferable to violent ones…
Why Terrorism Does Not Work
This is the first article to analyze a large sample of terrorist groups in terms of their policy effectiveness. It includes every foreign terrorist organization (FTO) designated by the U.S. Department of State since 2001. The key variable for FTO success is a tactical one: target selection. Terrorist groups whose attacks on civilian targets outnumber attacks on military targets do not tend to achieve their policy objectives, regardless of their nature. Contrary to the prevailing view that terrorism is an effective means of political coercion, the universe of cases suggests that, first, contemporary terrorist groups rarely achieve their policy objectives and, second, the poor success rate is inherent to the tactic of terrorism itself. The bulk of the article develops a theory for why countries are reluctant to make policy concessions when their civilian populations are the primary target.
Ouch!
That is a fairly likely outcome. It would represent business as usual. The entire history of life is one of some creatures profiting at the expense of other ones.
My point, then, is that as well as heroically trying to come up with a theory of Friendly AI, it might be a good idea to heroically stop the deployment of unFriendly AI.
Very large organisations do sometimes attempt to cut off their competitors’ air supply.
They had better make sure they have good secrecy controls if they don’t want it to blow up in their faces.
I’m generally operating under the assumption that this is clearly not an issue and seems so only for reasons that should obviously not be talked about given these assumptions. If you know what I mean.
I really don’t know what you mean.
I have no clue what you’re talking about and would really like to know. PM me if you really don’t want whatever it is to be publicly known.
Eh, I implied it heavily enough that it probably doesn’t matter: SIAI folks might have plans similar to those recommended by the OP, and SIAI donations might end up in such programs. I don’t think it’s all that likely, but I do think it’s almost completely correlated with whether or not it’s a good idea.