Thinking soberly about the context and consequences of Friendly AI
The project of Friendly AI would benefit from being approached in a much more down-to-earth way. Discourse about the subject seems to be dominated by a set of possibilities which are given far too much credence:
A single AI will take over the world
A future galactic civilization depends on 21st-century Earth
10^n-year lifespans are at stake, n greater than or equal to 3
We might be living in a simulation
Acausal deal-making
Multiverse theory
Add up all of that, and you have a great recipe for enjoyable irrelevance. Negate every single one of those ideas, and you have an alternative set of working assumptions that are still consistent with the idea that Friendly AI matters, and which are much more suited to practical success:
There will always be multiple centers of power
What’s at stake is, at most, the future centuries of a solar-system civilization
No assumption that individual humans can survive even for hundreds of years, or that they would want to
Assume that the visible world is the real world
Assume that life and intelligence are about causal interaction
Assume that the single visible world is the only world we affect or have reason to care about
The simplest reason to care about Friendly AI is that we are going to be coexisting with AI, and so we should want it to be something we can live with. I don’t see that anything important would be lost by strongly foregrounding the second set of assumptions, and treating the first set of possibilities just as possibilities, rather than as the working hypothesis about reality.
[Earlier posts on related themes: practical FAI, FAI without “outsourcing”.]
That sounds terribly open-minded and inclusive. However, when it comes to the first three bullets:
The probability assigned to these scenarios completely and fundamentally changes all expectations of experience with the GAIs in question, and all the relevant ethical and practical considerations are entirely different. For most intents and purposes you are advocating just talking about an entirely different thing than the one labelled FAI. Conversation which is conditional on one is completely incompatible with conversation conditional on the other (or, if made compatible, it is stilted).
Conversation is at least possible when such premises are substituted. The main thing that is being changed is the priority that certain outcomes are assigned when calculating expected utilities, i.e. success vs. failure on FAI production becomes an order of magnitude or two less important, so the probability of success needed to make a given investment worthwhile has to be larger.
When it comes to this one, the line of thought that advocates the latter rather than the former seems inane and ridiculous, but assuming the kind of thinking that led to it doesn’t creep into the rest of the discussion, it wouldn’t be a huge problem.
These ones don’t seem to interfere with Friendly AI discussion much at all now. They can be—and are—fairly isolated to discussions that are about their respective subjects. Nothing need be changed in order to have discussions about FAI that don’t talk about those issues.
Integrity, and the entire premise of the website. “Strongly foregrounding assumptions” that are ridiculous just isn’t what we do here. In particular I refer to trying to aggressively privilege the hypothesis that:
No assumption that individual humans can survive even for hundreds of years, or that they would want to
Followed closely by the “We’re going to co-exist with AIs vaguely as equals” that is being implied between the lines, and in particular by rejecting discussion premised on “A single AI will take over the world”.
I strongly reject the suggestion that we should pretend things are “given far too much credence” just because they aren’t “down to earth”. The “important thing that would be lost” is basically any point to having the conversation here at all. If you want to have “Sober” conversations at the expense of sane ones then have them elsewhere.
On these three issues—I’ll call them post-singularity concentration of power, size and duration of post-singularity civilization, and length of post-singularity lifespan—my preference is to make the low estimate the default, but to regard the high estimates as defensible.
In each case, there are arguments (increasing returns on self-enhancement, no sign of alien civilizations, fun theory / basic physics) in favor of the high estimate (one AI rules the world, the winner on Earth gets this Hubble volume as prize, no psychological or physical limits on lifespan shorter than that of the cosmos). And it’s not just an either-or choice between low estimate and high estimate; the low estimates could be off by just a few orders of magnitude, and that would still mean a genuinely transcendent future.
However, the real situation is that we just don’t know how much power could accrue to a developing AI, how big a patch of space and time humanity’s successors might get to claim as their own, and how much lifespan is achievable or bearable. Science and technology have undermined the intellectual basis for traditional ideas of what the limits are, but they haven’t ratified the transcendent scenarios either. So the “sober” thing to do is to say that AIs can become at least as smart as human beings, that we could live throughout the solar system, and that death by old age is probably unnecessary; and to allow that even more than that may be possible, with the extremes being represented by what I’ve just called the transcendent scenarios.
I think FAI is still completely relevant and necessary, even in the minimal scenario. UFAI can still wipe out the human race in that scenario; it would be just another genocide, just another extinction, in which one group of sapients displaces another, and history is full of such events. Similarly, the usual concerns of FAI—identifying the correct value system, stability under self-enhancement—remain completely relevant even if we’re just talking about AIs of roughly humanesque capabilities, as do concepts like CEV and reflective decision theory.
This has all become clear to me when I think about what it would take for the concept of Friendly AI (whether under that name, or some other) to really gain traction, culturally and politically. I can make the case for the minimal future scenario; I can make the case that something beyond the minimal scenario is possible, and that we don’t know how much beyond. I can live with the existence of people who dream about ruling their own galaxy one day. But it would be very unwise to tie serious FAI advocacy to that sort of dreaming.
Maybe one day there will be mass support, even for that sort of transhumanism; but that will be the extremist wing of the movement, a bit like the role of fundamentalists in conservative American politics. They are part of the conservative coalition, so they matter politically, but when they actually take over, you get craziness.
That is a much better starting point than the original post. Let’s explore those lower bounds:
Even if we can’t make something smarter than us, there’s no reason to think we can’t make something faster than us. Plus, it’s only a program, so it can duplicate itself, say, over the internet.
Even if it’s no smarter than us, it can still be pretty much unstoppable.
The sun is due to die some billions of years in the future. Hominoidea have existed for only about 28 million years, so our potential is at least 2 orders of magnitude longer than our past. Add population size into the mix (let’s say 10 billion and stabilizing, while our past would be more like less than 1 million for most of the time), and we add 4 more orders of magnitude to the value of the future.
So, the potential of our future is at least a million times greater than our entire past. That’s one hell of a potential, for a conservative estimate.
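The arithmetic above can be checked with a minimal sketch. The 5-billion-year figure for the Sun is an assumed round number (the comment only says "some billions"), and all names here are illustrative rather than taken from the discussion.

```python
import math

# Assumption: roughly 5 billion years of Sun remaining ("some billions" in the comment).
SUN_REMAINING_YEARS = 5e9
# From the comment: Hominoidea have existed for about 28 million years.
HOMINOID_PAST_YEARS = 28e6
# From the comment: 10 billion people in the future, under 1 million for most of the past.
FUTURE_POPULATION = 10e9
PAST_POPULATION = 1e6


def orders_of_magnitude(ratio: float) -> int:
    """Whole orders of magnitude in a ratio, rounded down.

    The tiny epsilon guards against floating-point error on exact powers of ten.
    """
    return math.floor(math.log10(ratio) + 1e-9)


time_orders = orders_of_magnitude(SUN_REMAINING_YEARS / HOMINOID_PAST_YEARS)
population_orders = orders_of_magnitude(FUTURE_POPULATION / PAST_POPULATION)
total_orders = time_orders + population_orders

print(time_orders, population_orders, total_orders)
```

With these inputs the time ratio is about 180, so the "at least a million times" figure holds with some room to spare; a shorter assumed solar lifetime of 3 billion years would still give 2 orders of magnitude on the time axis.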
I don’t have a good “mainstream” argument for the feasibility of long lifespans. Like wedrifid, I’d just avoid the subject if possible.
Now we can speculate about reasonable sounding worst-case and best-case scenarios:
Best case: Machines take over chores that make us unhappy. Lifespan is about 100 years, healthy lifespan is about 60-80 years. This goes on for the few billion years the Sun lets us have. Not a paradise, but still much better than the current state of affairs.
Worst case: Skynet wants to maximize paperclips, we all die very soon. Or we don’t do AI at all, and we kill ourselves in another manner, very soon.
There. Does that sound mainstream-friendly enough?
Separate ‘discourse about the subject of friendly AI’, i.e. loose speculations about the future, the universe and everything else, from ‘the project of friendly AI’, i.e. people trying to do actual math. Of the six assertions you make here, only the one about acausal stuff seems relevant to actual research. The others may be important in deciding whether friendly AI is worth pursuing at all, but if you start with the assumption that friendly AI matters, they don’t seem to matter much.
The assumptions that I criticize may be providing a lot of the motivation, but you can think that e.g. solving the problem of ethical stability under self-modification is important, without believing that stuff; and meanwhile, a lot of people must be encountering the concept of Friendly AI as part of a package which includes the futurist maximalism and the peculiar metaphysics. I suppose I’m saying that Friendly AI needs to be saved from the subculture that most vigorously supports it, because it contains ideas that really are important.
How about we plant seeds for a new culture intended to design mechanism-designing mechanisms? One seed institution will be this comment exchange. We can bootstrap from there.
One of these is not like the others. Whether a single AI takes over the world, or whether there are always multiple centers of power is something we choose. The rest are properties of the universe waiting to be discovered (or which we think we’ve discovered, as the case may be).
I think it’s an especially important choice, and I think that the world ends up looking much better, by my values, with one dominant AI rather than power diffused among many. This is not to say that one AI would make all the decisions, but rather that something has to be powerful enough to veto all the really bad ideas, and notice when others are playing with fire; and I don’t think very many entities can stably have that power at once. In human societies, spreading power out more reduces the damage if some of the humans turn out to be bad. Among AIs, the failure modes are different and I don’t think that’s the case any more; I think an insane AI does a more or less constant amount of damage (it destroys everything), so spreading out the power just increases the risk of one of them being bad.
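The claim that spreading power out increases the risk of one AI being bad can be illustrated with a toy calculation. This is only a sketch under an added assumption that is not in the comment: each AI independently has the same probability of being insane. The function name and the 10% figure are hypothetical.

```python
def prob_at_least_one_bad(p_bad: float, n_agents: int) -> float:
    """Probability that at least one of n independent agents is bad.

    Assumes each agent is bad with the same probability p_bad,
    independently of the others (an illustrative simplification).
    """
    return 1.0 - (1.0 - p_bad) ** n_agents


# Hypothetical 10% chance that any single AI is insane.
for n in (1, 2, 5, 10):
    print(n, round(prob_at_least_one_bad(0.1, n), 3))
```

If any single insane AI does a roughly constant amount of damage, as the comment supposes, then the expected damage tracks this probability, which only grows as the number of power centers grows.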
Whether recursive self-improvement is possible or not is a property of the universe. So are other details, like how much additional time and energy is necessary for another increase in intelligence.
The answer to this question can make some outcomes more likely. For example, if recursive self-improvement is possible, and at some level you can get a huge increase in intelligence very quickly and relatively cheaply, one of the centers of power could easily overpower the other ones. Perhaps even in situations where every super-agent would read and analyze the source code of all other super-agents all the time; the increased intelligence could allow one of them to make changes that will seem harmless to the other ones.
On the other hand, the multiple centers of power scenario is more likely if humankind spreads to many planets, and there is some natural limit on how high an intelligence can become before it somehow collapses or starts needing insane amounts of energy; so no super-agent could be smart enough to conquer the rest of the world.
The idea of technological determinism suggests this is hubris.
is not a well-supported statement. Two main trends argue against this:
1. With technological gaps, the number of centers is the number of agents with the technology.
In lots of colonies there were many centers of power, then colonists with guns landed and there was one center. Post-WW2, the US had the bomb and strategic superiority. The Soviets matched the tech and there were two power centers. When Britain ruled the seas there was one major naval power. In many developing nations in the Cold War there were two power centers that corresponded to US- and Soviet-supplied arms.
It’s not crazy to think of AI, or whatever technology it can invent, as another instance of tech superiority reducing the number of power centers.
Tech also seems quite vulnerable to monocultures. Think of file formats, for example. In the early days there are often several formats, but after a while most of them go extinct and the survivors end up being universally used. Image display formats, for example, fall largely into two categories: formats that every computer knows how to display, and formats that hardly anybody uses at all. (Image editing formats are different, I know.) How many word processors have you used recently that can’t support .doc format?
The most likely scenario is that there will be only one center of intelligence, and that although the intelligence isn’t really there yet, the center is. You’re using it now.
Nice post. A more general version of this assertion:
“If you are trying to convince someone to do X, use the most robust correct argument in favor of X, which is almost certainly not the one that makes the strongest assertions.”
While I agree with the main thrust of what you’re saying, I think the scenario you paint is not the same thing that Friendly AI (capitalized) is concerned with. Those seem like good reasons to think about Machine Ethics, for example. FAI is supposed to solve a specific sort of problem, which your assumptions seem to rule out.
I don’t see a difference of purpose between “Friendly AI” and “machine ethics”. Friendly AI is basically a school of thought about machine ethics—by far the best that I’ve seen, when it comes to depth of analysis, methodology, and ingenuity of concepts—that sprang up under unusual circumstances.
In terms of LW concepts, FAI tends to get grouped with “futurism and speculation”, and then that topic cluster is separated from “pure and applied rationality”. The perspective I want to foster sees more continuity between FAI and the study of human decision-making, and places a little more distance between FAI and the futurism/speculation cluster.
The concept of a hard takeoff plays a central role in how people think about FAI—perhaps this is the “specific sort of problem” you mean—but it can be motivated as a worst case that needs to be considered for safety’s sake, rather than as a prediction or dogma about what will actually happen.
The difference between FAI and Machine Ethics is the difference between “how do I make this robot not blow me up” and “how do I make this eldritch abomination not torture all of humanity for eternity”.
I thought another significant difference was that “Ethics” doesn’t even imply getting as far as “How do I?”. An Ethical discussion could center around “Would a given scenario be considered torture and is this better or worse than extinction?”
No, Machine Ethics is the field concerned with exactly the question of how to program ethical machines. For example, Arkin’s Governing Lethal Behavior in Autonomous Robots is a work in the field of Machine Ethics.
In principle, a philosopher could try to work in Machine Ethics and only do speculative work on, for example, whether it’s good to have robots that like torture. But inasmuch as that’s a real question, it’s relevant to the practical project.
I was under the impression that Machine Ethics is mostly being researched by Computer Science experts and AI or Neuro-something specialists in particular.
My prior for someone already doing research in IT concentrating their research efforts into “How do I code something that does X” is much higher than for someone doing research in, say, propagation of memes in animal populations or intergalactic lensing distortions (Dark Matter! *shivers*).
Pretty much a mix of people who know about machines and people who know about ethics. Arkin is a roboticist. Anderson and Anderson are a philosopher/computer scientist duo. Colin Allen is a philosopher, and I believe Wendell Wallach is too.
I’d argue that the best work is done by computing/robotics folks, yes.
Yes, that pretty well captures it.
That is only a superficial difference, a difference of scenario considered. If you put a bad actor from ordinary machine-ethics into a possible world where you can torture someone forever, or if you put a UFAI into a possible world where the most harm it can do is blow you up once, this difference goes away.
Designing an “ethical computer program” or a “friendly AI” is not about which possible world the program inhabits, it’s about the internal causality of the program and the choices it makes. The valuable parts of FAI research culture are all on this level. Associating FAI with the possible world of “post-singularity hell”, as if that is the essence of what distinguishes the approach, is an example of what I want to combat in this post.
The key difference is that in the case of a Seed AI, you need to find a way to make a goal system stable under recursive self-improvement. In the case of a toaster, you do not.
It’s useful to keep Friendly AI concerns in mind when designing ethical robots, since they potentially become a risk when they start to get more autonomous. But when you’re giving a robot a gun, the relevant ethical concerns are things like whether it will shoot civilians. The scope is relevantly different.
Really, there is a whole field out there of Machine Ethics, and it’s pretty well established that it’s up to a different sort of thing than what SIAI is doing. While some folks still conflate “Friendly AI” and “Machine Ethics”, I think it’s much better to maintain the distinction and consider FAI a subfield of Machine Ethics.
There will always be multiple centers of power
What’s at stake is, at most, the future centuries of a solar-system civilization
No assumption that individual humans can survive even for hundreds of years, or that they would want to
You give no reason why we should consider these as more likely than the original assumptions.
I’d like to suggest that it is important that the friendly AGI would hopefully also want to live with us. I’d further suggest that this is part of why efforts such as LW are important.
I don’t see how this can be avoided. If the damn thing is so much smarter, it can only treat “normal” humans as pets, ants, or, best case, wild animals confined to a sanctuary for their own good.
“Pet”, “vermin”, and “wild animal” (as well as “livestock” and “working animal”) are all concepts that humans have come up with for our species’ relationships with other species that we’ve been living with since forever, and have developed both instincts and cultural practices to relate to. Why would you expect them to apply to an AI’s relationship to humans? Isn’t that a bit, well, anthropomorphizing?
Indeed it is, a bit. This is just an analogy meant to convey that humans aren’t likely to stop a foomed AI (or maybe a group of them, if such a term will even make sense) from doing what it wants, just like animals are powerless to stop determined humans.
Companies and governments are much smarter than humans. So far, none has taken over the world. Companies compete with other companies. Governments compete with other governments. Like that.
Are they? More powerful, maybe. Often wealthier. But what evidence do you have that they are smarter? They often act rather stupidly.
Collective intelligence won in collective intelligence vs groupthink.
I don’t see how this is relevant to the issue. Sure, an average organization is smarter than an average human, but it is not nearly as smart as a smart human, let alone a foomed AI.
A large organization can be as smart as the smartest human it can hire (the listening part is up to the organization, of course).
Well, the idea that a single smart machine will take over the world because it will comprehensively trounce humans makes no sense. Smart machines are likely to compete with other smart machines, much as they do today.
Give the companies time. They’re making good progress.
The governments too though. A company needs to overthrow all the governments to take over the world. Not an impossible task, perhaps, but it would be quite a revolution—and probably a bad one.
Ok, we have a serious problem with upvoting any criticism of common memes, no matter how inane, for fear of seeming phyggish. If this were about just about any other subject it’d be downvoted to oblivion.
That’s possible, but I find it more likely that we have a silent majority of people who are on the periphery of the community and are uncomfortable with a lot of the singularitarian memes. I think that a lot of internet communities with voting have a lot of people who vote but don’t comment or comment only occasionally, so you’d expect that voting would reflect more peripheral members more strongly than the discussion would.
“The singularity” has diverse connotations; it would be good to specify which memes you’re thinking of.
I mean the ones that most prominent LWers endorse.