Wow, chill out, Eliezer. You’re probably among the top 10, certainly among the top 20, most intelligent people I’ve met. That’s good enough for anything you could want to do. You are ranked high enough that luck, money, and contacts will all be more important factors for you than some marginal increase in intelligence.
What I think is a far more likely scenario than missing out on the mysterious essence of rightness by indulging the collective human id, is that what ‘humans’ want as a compiled whole is not what we’ll want as individuals. Phil might be aesthetically pleased by a coherent metamorality, and distressed if the CEV determines that what most people want is puppies, sex, and crack. Remember that the percentage of the population that actually engages in debates over moral philosophy is vanishingly small; everyone else just acts, frequently incoherently.
Ooh! I vote for puppies, sex, and crack. (Just not all at the same time.)
Eliezer says:
As far as I can tell, Phil Goetz is still pursuing a mysterious essence of rightness—something that could be right, when the whole human species has the wrong rule of meta-morals.
Eliezer,
I have made this point twice now, and you’ve failed to comprehend it either time, and you’re smart enough to comprehend it, so I conclude that you are overconfident. :)
The human species does not consciously have any rule of meta-morals. Neither do humans consciously follow rules to evolve in a certain direction. Evolution happens because the system dynamics cause it to happen. There is a certain subspace of possible (say) genomes that is, by some objective measures, “good”.
Likewise, human morality may have evolved in ways that are “good”, without humans knowing how that happened. I’m not going to try to figure out here what “good” might mean, but I believe the analogy I’m about to make is strong enough that you should admit this as a possibility. And if you don’t, you must admit my accusation (which so far you haven’t) that CEV is abandoning the possibility that there is such a thing as “good”.
(And if you don’t admit any possibility that there is such a thing as goodness, you should close up shop, go home, and let the paperclipping AIs take over.)
If we seize control over our physical and moral evolution, we’d damn well better understand what we’re replacing. CEV means replacing evolution with a system whereby people vote on what feature they’d like to evolve next.
I know you can understand this next part, so I’m hoping to hear some evidence of comprehension from you, or some point on which you disagree:
Dynamic systems can be described by trajectories through a state space. Suppose you take a snapshot of a bunch of particles traveling along these trajectories. For some open systems, the entropy of the set of particles can decrease over time. (You might instead say that, for the complete closed system, the entropy of the projection of a set of particles onto a manifold of its space can decrease. I’m not sure this is equivalent, but my instinct is that it is.) I will call these systems “interesting”.
For a dynamic system to be interesting, it must have dimensions or manifolds in its space along which trajectories contract; in a bounded state space, this means that trajectories will end at a point, or in a cycle, or in a chaotic attractor.
We desire, as a rule of meta-ethics, for humanity to evolve according to rules that are interesting, in the sense just described. This is equivalent to saying that the complexity of humanity/society, by some measure, should increase. (Agree? I assume you are familiar enough with complex adaptive systems that I don’t need to justify this.)
A system can be interesting only if there is some dynamic creating these attractors. In evolution, this dynamic is natural selection. Most trajectories for an organism’s genome, without selection, would lead off the manifold on which that genome builds a viable creature. Without selection, mutation would simply increase the entropy of the genome. Natural selection is a force pushing these trajectories back towards the “good” manifold.
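To make that concrete, here is a toy sketch of my own (the numbers and the simple contraction term are purely illustrative, not anything specified by CEV): a population of points takes random “mutation” steps in a bounded space, with or without a pull back toward a “viable” region standing in for selection. Without the pull, the histogram entropy of the ensemble drifts up toward its maximum; with it, the entropy stays low, i.e. the trajectories are held near the manifold.

```python
# Toy illustration (my own, not part of CEV): "mutation" as a bounded random
# walk, "selection" as a contraction toward a viable region at x = 0.
# Without selection the ensemble entropy drifts toward the uniform maximum;
# with selection it stays low, i.e. trajectories are held near the manifold.
import numpy as np

rng = np.random.default_rng(0)

def ensemble_entropy(x, bins=50, lo=-5.0, hi=5.0):
    """Shannon entropy (in nats) of a histogram estimate of the ensemble."""
    counts, _ = np.histogram(x, bins=bins, range=(lo, hi))
    p = counts / counts.sum()
    p = p[p > 0]
    return -(p * np.log(p)).sum()

def simulate(selection_strength, steps=200, n=20000, noise=0.3):
    x = np.zeros(n)                               # start on the "viable" point
    for _ in range(steps):
        x = x + noise * rng.standard_normal(n)    # "mutation": pure diffusion
        x = x - selection_strength * x            # "selection": pull back toward 0
        x = np.clip(x, -5.0, 5.0)                 # bounded state space
    return ensemble_entropy(x)

print("entropy, mutation only:        %.2f" % simulate(selection_strength=0.0))
print("entropy, mutation + selection: %.2f" % simulate(selection_strength=0.2))
```

(Real selection works through differential reproduction rather than a force applied to each genome, of course; the contraction term is only a stand-in for the way selection culls trajectories that wander off the viable manifold.)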
CEV proposes to replace natural selection with (trans)human supervision. You want to do this even though you don’t know what the manifold for “good” moralities is, nor what aspects of evolution have kept us near that manifold in the past. The only way you can NOT expect this to be utterly disastrous is if you are COMPLETELY CERTAIN that morality is arbitrary, and there is no such manifold.
Since there OBVIOUSLY IS such a manifold for “fitness”, I think the onus is on you to justify your belief that there is no such manifold for “morality”. We don’t even need to argue about terms. The fact that you put forth CEV, and that you worry about the ethics of AIs, proves that you do believe “morality” is a valid concept. We don’t need to understand that concept; we need only to know that it exists, and is a by-product of evolution. “Morality” as developed further under CEV is something different than “morality” as we know it, by which I mean, precisely, that it would depart from the manifold. Whatever the word means, what CEV would lead to would be something different.
CEV makes an unjustified, arbitrary distinction between levels. It considers the “preferences” (which I, being a materialist, interpret as “statistical tendencies”) of organisms, or of populations, but not of the dynamic system. Why do you discriminate against the larger system?
Carl writes,
If Approach 2 fails to achieve the aims of Approach 1, then humanity generally wouldn’t want to pursue Approach 1 regardless. Are you asserting that your audience would tend to diverge from the rest of humanity if extrapolated, in the direction of Approach 1?
Yes; but reverse the way you say that. There are already forces in place that keep humanity evolving in ways that may be advantageous morally. CEV wants to remove those forces without trying to understand them first. Thus it is CEV that will diverge from the way human morality has evolved thus far.
It sounds to me like this is leading towards collective extrapolated volition, and that you are presenting it as “patching” your previous set of beliefs so as to avoid catastrophic results in case life is meaningless.
It’s not a patch. It’s throwing out the possibility that life is not meaningless. Or, at least, it now opens up a big security hole for a set of new paths to catastrophe.
Approach 1: Try to understand morality. Try to design a system to be moral, or design a space for that system in which the gradient of evolution is similar to the gradient for morality.
Approach 2: CEV.
If there is some objective aspect to morality—perhaps not a specific morality, but let us say there are meta-ethics, rules that let us evaluate moral systems—then approach 1 can optimize above and beyond human morality.
Approach 2 can optimize accomplishment of our top-level goals, but can’t further optimize the top-level goals themselves. It freezes in any existing moral flaws at that level forever (such flaws do exist if there is an objective aspect to morality). Depending on the nature of the search space, it may inevitably lead to moral collapse (if we are at some point in moral space that has been chosen by adaptive processes that keep that point near some “ideal” manifold, and trajectories followed through moral space via CEV diverge from that manifold).
Eliezer—Consider maximizing y in the search space y = -vector_length(x). You can make this space as large as you like, by increasing the range or the dimensionality of x. But the problem does not get any more difficult, whether you measure by effort, power needed, or intelligence needed.
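A quick sketch of what I mean (the step size and tolerance are my own illustrative choices): gradient ascent on y = -||x|| from a starting point of fixed length takes essentially the same number of steps whether x has 3 dimensions or 3 million, because the gradient is always a unit vector pointing straight back at the optimum x = 0.

```python
# Sketch of the point above (my own numbers): maximizing y = -||x|| by
# gradient ascent. The gradient of y is -x/||x||, a unit vector pointing
# at the optimum x = 0, so the step count depends only on the starting
# distance and the step size, not on the dimensionality of x.
import numpy as np

def steps_to_optimum(dim, start_norm=10.0, lr=0.05, tol=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)
    x *= start_norm / np.linalg.norm(x)        # same starting distance in every dimension
    steps = 0
    while np.linalg.norm(x) > tol:
        x = x - lr * x / np.linalg.norm(x)     # gradient of -||x|| is -x/||x||; step toward 0
        steps += 1
    return steps

for d in (3, 300, 3_000_000):
    print(d, "dimensions:", steps_to_optimum(d), "steps")
```

Enlarging the space does nothing to the difficulty; only the distance from the start to the optimum matters.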
I thought about this a bit more last night. I think the right justification for religion—which is not one that any religious person would consciously agree with—is that it does not take on faith the idea that truth is always good.
Reductionism aims at learning the truth. Religion is inconsistent and false—and that’s a feature, not a bug. Its social purpose is to grease the wheels of society where bare truth would create friction.
For example: In Rwanda, people who slaughtered the families of other people in their village are now getting out of jail and coming back to live, in the same villages, with the surviving relatives of their victims. Rwanda needs this to happen; there are so many killers and conspirators that they can’t keep them all in jail or kill them—these killers are a significant part of the nation’s work force, and doing so would start the war all over again.
I have heard a few accounts of how they persuade the surviving relatives to forgive and live with the killers. These accounts agree that the only way to do this is by using religious arguments.
Perhaps a true rationalist could be persuaded to leave the killer of their family alone, on grounds of self-interest. I’m easily more rational than 99.9% of the population, but I don’t think I’m that rational.
If we had a population of purely rational thinking machines, perhaps we would need no religion. But since we have only humans to work with, it may play a valid role where the irrational nature of humans and the rational truth of science would, together, lead to disaster.
Once, in a LARP, I played Isaac Asimov on a panel which was arguing whether vampires were real. It went something like this (modulo my memory): I asked the audience to define “vampire”, and they said that vampires were creatures that lived by drinking blood.
I said that mosquitoes were vampires. So they said that vampires were humanoids who lived by drinking blood.
I said that Masai who drank the blood of their cattle were vampires. So they said that vampires were humanoids who lived by drinking blood, and were burned by sunlight.
I (may have) said that a Masai with xeroderma pigmentosum was a vampire. And so on.
My point was that vampires were by definition not real—or at least, not understandable—because any time we found something real and understandable that met the definition of a vampire, we would change the definition to exclude it.
(Strangely, some mythical creatures, such as vampires and unicorns, seem to be defined in a spiritual way; whereas others, such as mermaids and centaurs, do not. A horse genetically engineered to grow a horn would probably not be thought of as a “real” unicorn; a genetically engineered mermaid probably would be admitted to be a “real” mermaid.)
I had a similar, shorter conversation with a theologian. He had hired me to critique a book he was writing, which claimed that reductionist science had reached its limits, and that it was time to turn to non-reductionist science.
The examples he gave were all phenomena which science had difficulty explaining, and which he claimed to explain as being irreducibly complex. For instance, because people had difficulty explaining how cells migrate in a developing fetus, he suggested (as Aristotle might have) that the cells had an innate fate or desire that led them to the right location.
What he really meant by non-reductionist science was that, as a “non-reductionist scientist”, one is allowed to throw up one’s hands and say that there is no explanation for something. A claim that a phenomenon is supernatural is always the assertion that something has no explanation. (I don’t know that it needs to be presented as a mental phenomenon, as Eliezer says.) So to “do” non-reductionist science is simply to not do science.
It should be possible, then, for a religious person to rightly claim that their point of view is outside the realm of science. If they said, for instance, that lightning is a spirit, that is not a testable hypothesis.
In practice, religions build up webs of claims, and of connections to the non-spiritual world, that can be tested for consistency. If someone claims not just that lightning is a spirit, but that an anthropomorphic God casts lightning bolts at sinners, that is a testable hypothesis. Once, when I was a Christian, lightning struck the cross behind my church. This struck me as strong empirical evidence against the idea that God directed every bolt. (I suppose one could interpret it as divine criticism of the church. The church elders did not, however, pursue that angle.)
Perhaps this is how we generally explain the actions of others. The notion of a libertarian economist who wants to deregulate industry because he has thought about it and decided it is good for everyone in the long run would be about as alien to most people as an AI. They find it much more believable that he is a tool of corporate oppression.
Whether this heuristic reduction to the simplest explanation is wrong more often than it is right is another question.
There are several famous science fiction stories about humans who program AIs to make humans happy, which then follow the letter of the law and do horrible things. The earliest is probably “With Folded Hands” by Jack Williamson (1947), in which AIs are programmed to protect humans, and they do this by preventing humans from doing anything or going anywhere. The most recent may be the movie “I, Robot.”
I agree with E’s general point—that AI work often presupposes that the AI magically has the same concepts as its inventor, even outside the training data—but the argument he uses is insidious and has disastrous implications:
Which is the correct classification? This is not a property of the training data; it is a property of your preferences (or, if you prefer, a property of the idealized abstract dynamic you name “right”).
This is the most precise assertion of the relativist fallacy that I’ve ever seen. It’s so precise that its wrongness should leap out at you. (It’s a shame that most relativists don’t have the computational background for me to use it to explain why they’re wrong.)
By “relativism”, I mean (at the moment) the view that almost everything is just a point of view: There is no right or wrong, no beauty or ugliness. (Pure relativism would also claim that 2+2=5 is as valid as 2+2=4. There are people out there who think that. I’m not including that claim in my temporary definition.)
The argument for relativism is that you can never define anything precisely. You can’t even come up with a definition for the word “game”. So, the argument goes, whatever definition you use is okay. Stated more precisely, it would be Eliezer’s claim that, given a set of instances, any classifier that agrees with the input set is equally valid.
The counterargument is, in part, that some classifiers are better than others, even when all of them satisfy the training data completely. The most obvious criterion to use is the complexity of the classifier.
Eliezer’s argument, if he followed it through, would conclude that neural networks, and induction in general, can never work. The fact is that they often do.
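To make the counterargument concrete, here is a toy sketch (the data set and the crude description-length measure are my own choices): two classifiers that both agree perfectly with the training set, one a one-parameter threshold and one a lookup table that memorizes the examples. The complexity criterion prefers the threshold, and only the threshold generalizes to a held-out point.

```python
# Toy sketch of "some classifiers are better than others, even when all of
# them satisfy the training data." The data, classifiers, and complexity
# measure here are my own illustrative choices.

train = [(0.5, 0), (1.2, 0), (2.8, 0), (3.4, 1), (4.9, 1), (6.0, 1)]

def threshold_classifier(x):
    """One parameter: label is 1 iff x > 3."""
    return int(x > 3.0)

memorized = {x: label for x, label in train}

def lookup_classifier(x):
    """Memorizes every training pair; answers 0 for anything unseen."""
    return memorized.get(x, 0)

def training_error(clf):
    return sum(clf(x) != label for x, label in train)

def description_length(params):
    """Crude complexity measure: how many numbers you must write down."""
    return len(params)

print("training errors:", training_error(threshold_classifier),
      training_error(lookup_classifier))                        # 0 and 0
print("complexity:", description_length([3.0]),
      description_length([v for pair in train for v in pair]))  # 1 vs 12
print("held-out x = 5.5:", threshold_classifier(5.5), lookup_classifier(5.5))
# The threshold says 1 (plausible); the lookup table says 0, because it
# never learned anything beyond the training points.
```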
Phil Goetz, why should I care what sort of creatures the universe “tends to produce”? What makes this a moral argument that should move me? Do you think that most creatures the universe produces must inevitably evolve to be moved by such an argument?
I stated the reason:
We MUST make this meta-level argument that the universe inherently produces creatures with pretty-valuable values. We have no other way of claiming to be better than pebble-sorters.
I don’t think that we can argue for our framework of ideas from within our framework of ideas. If we continue to insist that we are better than pebble-sorters, we can justify it only by claiming that the processes that lead to our existence tend to produce good outcomes, whereas the hypothetical pebble-sorters are chosen from a much larger set of possible beings, with a much lower average moral acceptability.
A problem with this is that all sorts of insects and animals exist with horrifying “moral systems”. We might convince ourselves that morals improve as a society becomes more complex. (That’s just a thought in postscript.)
One possible conclusion—not one that I have reached, but one that you might conclude if the evidence comes out a certain way—is that the right thing to do is not to make any attempt to control the morals of AIs, because general evolutionary processes may be better at designing morals than we are.
Thinking about this post leads me to conclude that CEV is not the most right thing to do. There may be a problem with my reasoning, in that it could also be used by pebble-sorters to justify continued pebble-sorting. However, my reasoning includes the consequence that pebble-sorters are impossible, so that is a non-issue.
Think about our assumption that we are in fact better than pebble-sorters. It seems impossible for us to construct an argument concluding this, because any argument we make presumes the values we are trying to conclude.
Yet we continue to use the pebble-sorters, not as an example of another, equally-valid ethical system, but as an example of something wrong.
We can justify this by making a meta-level argument that the universe is biased to produce organisms with relatively valuable values. (I’m worried about the semantics of that statement, but let me continue.) Pebble-sorting, and other futile endeavors, are non-adaptive, and will lose any evolutionary race to systems that generate increased complexity (from some energy input).
We MUST make this meta-level argument that the universe inherently produces creatures with pretty-valuable values. We have no other way of claiming to be better than pebble-sorters.
Given this, we could use CEV to construct AIs… but we can also try to understand WHY the universe produces good values. Once we understand that, we can use the universe’s rules to direct the construction of AIs. This could result in AIs with wildly different values than our own, but it may be more likely to result in non-futile AIs, or to produce more-optimal AIs (in terms of their values).
It may, in fact, be difficult or impossible to construct AIs that aren’t eventually subject to the universe’s benevolent, value-producing bias—since these AIs will be in the universe. But we have seen in human history that, although there are general forces causing societies with some of our values to prosper, we nonetheless find societies stuck in local minima of continual warfare, pain, and poverty. So some effort on our part may increase the odds of, or decrease the time until, a good result.
Okay, I realize you’re going to read that and say, “It’s obviously not good enough for things requiring superhuman intelligence!”
I meant that if you compare your attributes to those of other humans, and sort those attributes with the one that gives you the most trouble in attaining your goal at the top, intelligence will not be near the top of that list for you, for any goal.