Contaminated by Optimism
Followup to: Anthropomorphic Optimism, The Hidden Complexity of Wishes
Yesterday, I reprised in further detail The Tragedy of Group Selectionism, in which early biologists believed that predators would voluntarily restrain their breeding to avoid exhausting the prey population; the given excuse was “group selection”. Not only does it turn out to be nearly impossible for group selection to overcome a countervailing individual advantage; but when these nigh-impossible conditions were created in the laboratory—group selection for low-population groups—the actual result was not restraint in breeding, but, of course, cannibalism, especially of immature females.
I’ve made even sillier mistakes, by the way—though about AI, not evolutionary biology. And the thing that strikes me, looking over these cases of anthropomorphism, is the extent to which you are screwed as soon as you let anthropomorphism suggest ideas to examine.
In large hypothesis spaces, the vast majority of the cognitive labor goes into noticing the true hypothesis. By the time you have enough evidence to consider the correct theory as one of just a few plausible alternatives—to represent the correct theory in your mind—you’re practically done. Of this I have spoken several times before.
And by the same token, my experience suggests that as soon as you let anthropomorphism promote a hypothesis to your attention, so that you start wondering if that particular hypothesis might be true, you’ve already committed most of the mistake.
The group selectionists did not deliberately extend credit to the belief that evolution would do the aesthetic thing, the nice thing. The group selectionists were doomed when they let their aesthetic sense make a suggestion—when they let it promote a hypothesis to the level of deliberate consideration.
It’s not like I knew the original group selectionists. But I’ve made analogous mistakes as a teenager, and then watched others make the mistake many times over. So I do have some experience whereof I speak, when I speak of instant doom.
Unfortunately, the prophylactic against this mistake is not a recognized technique of Traditional Rationality.
In Traditional Rationality, you can get your ideas from anywhere. Then you weigh up the evidence for and against them, searching for arguments on both sides. If the question hasn’t been definitely settled by experiment, you should try to do an experiment to test your opinion, and dutifully accept the result.
“Sorry, you’re not allowed to suggest ideas using that method” is not something you hear, under Traditional Rationality.
But it is a fact of life, an experimental result of cognitive psychology, that when people have an idea from any source, they tend to search for support rather than contradiction—even in the absence of emotional commitment.
It is a fact of life that priming and contamination occur: just being briefly exposed to completely uninformative, known false, or totally irrelevant “information” can exert significant influence on subjects’ estimates and decisions. This happens on a level below deliberate awareness, and that’s going to be pretty hard to beat on problems where anthropomorphism is bound to rush in and make suggestions—but at least you can avoid deliberately making it worse.
It is a fact of life that we change our minds less often than we think. Once an idea gets into our heads, it is harder to get it out than we think. Only an extremely restrictive chain of reasoning, that definitely prohibited most possibilities from consideration, would be sufficient to undo this damage—to root an idea out of your head once it lodges. The less you know for sure, the easier it is to become contaminated—weak domain knowledge increases contamination effects.
It is a fact of life that we are far more likely to stop searching for further alternatives at a point when we have a conclusion we like, than when we have a conclusion we dislike.
It is a fact of life that we hold ideas we would like to believe, to a lower standard of proof than ideas we would like to disbelieve. In the former case we ask “Am I allowed to believe it?” and in the latter case ask “Am I forced to believe it?” If your domain knowledge is weak, you will not know enough for your own knowledge to grab you by the throat and tell you “You’re wrong! That can’t possibly be true!” You will find that you are allowed to believe it. You will search for plausible-sounding scenarios where your belief is true. If the search space of possibilities is large, you will almost certainly find some “winners”—your domain knowledge being too weak to definitely prohibit those scenarios.
It is a fact of history that the group selectionists failed to relinquish their folly. They found what they thought was a perfectly plausible way that evolution (evolution!) could end up producing foxes who voluntarily avoided reproductive opportunities(!). And the group selectionists did in fact cling to that hypothesis. That’s what happens in real life! Be warned!
To beat anthropomorphism you have to be scared of letting anthropomorphism make suggestions. You have to try to avoid being contaminated by anthropomorphism, as best you can.
As soon as you let anthropomorphism generate the idea and ask, “Could it be true?” then your brain has already swapped out of forward-extrapolation mode and into backward-rationalization mode. Traditional Rationality contains inadequate warnings against this, IMO. See in particular the post where I argue against the Traditional interpretation of Devil’s Advocacy.
Yes, there are occasions when you want to perform abductive inference, such as when you have evidence that something is true and you are asking how it could be true. We call that “Bayesian updating”, in fact. An occasion where you don’t have any evidence but your brain has made a cute little anthropomorphic suggestion, is not a time to start wondering how it could be true. Especially if the search space of possibilities is large, and your domain knowledge is too weak to prohibit plausible-sounding scenarios. Then your prediction ends up being determined by anthropomorphism. If the real process is not controlled by a brain similar to yours, this is not a good thing for your predictive accuracy.
This is a war I wage primarily on the battleground of Unfriendly AI, but it seems to me that many of the conclusions apply to optimism in general.
How did the idea first come to you, that the subprime meltdown wouldn’t decrease the value of your investment in Danish deuterium derivatives? Were you just thinking neutrally about the course of financial events, trying to extrapolate some of the many different ways that one financial billiard ball could ricochet off another? Even this method tends to be subject to optimism; if we know which way we want each step to go, we tend to visualize it going that way. But better that, than starting with a pure hope—an outcome generated because it ranked high in your preference ordering—and then permitting your mind to invent plausible-sounding reasons it might happen. This is just rushing to failure.
And to spell out the application to Unfriendly AI: You’ve got various people insisting that an arbitrary mind, including an expected paperclip maximizer, would do various nice things or obey various comforting conditions: “Keep humans around, because diversity is important to creativity, and the humans will provide a different point of view.” Now you might want to seriously ask if, even granting that premise, you’d be kept in a nice house with air conditioning; or kept in a tiny cell with life support tubes and regular electric shocks if you didn’t generate enough interesting ideas that day (and of course you wouldn’t be allowed to die); or uploaded to a very small computer somewhere, and restarted every couple of years. No, let me guess, you’ll be more productive if you’re happy. So it’s clear why you want that to be the argument; but unlike you, the paperclip maximizer is not frantically searching for a reason not to torture you.
Sorry, the whole scenario is still about as unlikely as your carefully picking up ants on the sidewalk, rather than stepping on them, and keeping them in a happy ant colony for the sole express purpose of suggesting blog comments. There are reasons in my goal system to keep sentient beings alive, even if they aren’t “useful” at the moment. But from the perspective of a Bayesian superintelligence whose only terminal value is paperclips, it is not an optimal use of matter and energy toward the instrumental value of producing diverse and creative ideas for making paperclips, to keep around six billion highly similar human brains. Unlike you, the paperclip maximizer doesn’t start out knowing it wants that to be the conclusion.
Your brain starts out knowing that it wants humanity to live, and so it starts trying to come up with arguments for why that is a perfectly reasonable thing for a paperclip maximizer to do. But the paperclip maximizer itself would not start from the conclusion that it wanted humanity to live, and reason backward. It would just try to make paperclips. It wouldn’t stop, the way your own mind tends to stop, if it did find one argument for keeping humans alive; instead it would go on searching for an even superior alternative, some way to use the same resources to greater effect. Maybe you just want to keep 20 humans and randomly perturb their brain states a lot.
If you can’t blind your eyes to human goals and just think about the paperclips, you can’t understand what the goal of making paperclips implies. It’s like expecting kind and merciful results from natural selection, which lets old elephants starve to death when they run out of teeth.
A priori, if you want a nice result that takes 10 bits to specify, then a priori you should expect a 1/1024 probability of finding that some unrelated process generates that nice result. And a genuinely nice outcome in a large outcome space takes a lot more information than the English word “nice”, because what we consider a good outcome has many components of value. It’s extremely suspicious if you start out with a nice result in mind, search for a plausible reason that a not-inherently-nice process would generate it, and, by golly, find an amazing clever argument.
And the more complexity you add to your requirements—humans not only have to survive, but have to survive under what we would consider good living conditions, etc.—the less you should expect, a priori, a non-nice process to generate it. The less you should expect to, amazingly, find a genuine valid reason why the non-nice process happens to do what you want. And the more suspicious you should be, if you find a clever-sounding argument why this should be the case. To expect this to happen with non-trivial probability is pulling information from nowhere; a blind arrow is hitting the center of a small target. Are you sure it’s wise to even search for such possibilities? Your chance of deceiving yourself is far greater than the a priori chance of a good outcome, especially if your domain knowledge is too weak to definitely rule out possibilities.
No more than you can guess a lottery ticket, should you expect a process not shaped by human niceness, to produce nice results in a large outcome space. You may not know the domain very well, but you can understand that, a priori, “nice” results require specific complexity to happen for no reason, and complex specific miracles are rare.
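The arithmetic behind the "10 bits" claim can be made explicit. The sketch below is just an illustration of the point in the post, not anything from the original text; `prior_prob` is a made-up helper name, and the bit counts are arbitrary examples.

```python
def prior_prob(bits):
    """Prior probability that an unrelated process happens to hit a
    target outcome that takes `bits` bits to specify: each binary
    constraint the outcome must satisfy halves the probability."""
    return 2.0 ** -bits

# A single "nice" property costing 10 bits: 1 chance in 1024.
print(prior_prob(10))        # 0.0009765625

# Stacking requirements (survival AND good living conditions AND ...)
# multiplies the improbabilities, i.e., adds the bit costs.
print(prior_prob(10 + 20))   # ~9.3e-10
```

The key property is that improbabilities multiply: meeting a 10-bit requirement and an independent 20-bit requirement is as unlikely as meeting a single 30-bit one.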
I wish I could tell people: “Stop! Stop right there! You defeated yourself the moment you knew what you wanted! You need to throw away your thoughts and start over with a neutral forward extrapolation, not seeking any particular outcome.” But the inferential distance is too great; and then begins the slog of, “I don’t see why that couldn’t happen” and “I don’t think you’ve proven my idea is wrong.”
It’s Unfriendly superintelligence that tends to worry me most, of course. But I do think the point generalizes to quite a lot of optimism. You may know what you want, but Nature doesn’t care.
In my darker moments, I think that every human political tendency is just an instance of this very problem. Terry Pratchett (our most underrated explicator and critic of both Traditional Rationality and Bourgeois Morality) described it most pithily as “Wouldn’t It Be Nice, If Everyone Was Nice”. It’s most obvious on the left, but can be seen on the right and libertarian tendencies as well.
[deleted]
Caledonian,
There was selection at the level of the group, according to the standard definitions. If you’re going to make confrontational non-substantive comments on every post, please at least try to know what you’re talking about.
“Sorry, you’re not allowed to suggest ideas using that method” is not something you hear, under Traditional Rationality.
But it is a fact of life, ….
It is a fact of life that ….
I disagree. You list a whole collection of mistakes people make after they have a bad hypothesis that they’re attached to. I say the mistake is not in using your prior experience when you come up with hypotheses. The mistakes are first to get too attached to one hypothesis, followed by the list of “facts of life” mistakes you then described.
People will get hypotheses from whatever source. Telling them they shouldn’t have done it is like telling a jury to completely ignore evidence that they should not have been exposed to. Very hard commands to obey.
What’s needed is a new skill. In mathematics, I found it useful when I had trouble proving a theorem to try to disprove it instead. The places I had trouble with the theorem gave great hints toward building a counterexample. Then if the counterexample kept running into problems, the kind of problems it ran into gave great hints toward how to solve the theorem. Then if I still couldn’t prove it, the problems pointed toward a better counterexample, and so on.
So for things like evolutionary questions, once you have an idea about a way that evolution might work to get a particular result, the needed skill might be to look for any way that other genes could be selected while subverting that process. If you can be honest enough that you see it’s easy for it to fail and hard for it to succeed, then the proposed mechanism gets a lot easier to reject.
This logic applied to SDI, for example. The argument wasn’t whether we could build the advanced technology required to shoot down an ICBM. The argument was whether we could improve SDI as fast as our potential enemies could improve their SDI-blocking methods. And we clearly could not.
The question isn’t “Can it work?”. The question is “Can it outcompete all comers?”. Group selection advocates got caught up in the question whether there are circumstances that allow group selection. Yes, there are. It can happen. But then there’s the question how often those circumstances show up, and currently the answer looks like “rarely”.
[deleted]
To get this whole line of reasoning off the ground, you need a decent way to rank phenomena in terms of how similar they are to us. Given this ranking, the warning is to beware of treating low ranked items like high ranked items. On AI, you need to give an argument why AI is a low ranked item, i.e., why AI is especially unlike us.
Now that you mention it, I’m actually not 100% sure that a paperclip maximizer wouldn’t give humans some fraction of computing resources as some sort of very cheap game theory move.
Oh, and my ants say they’re offended.
I guess “game theory move” doesn’t make much sense; it should have read “given the possibility that it’s being simulated”.
Hardly any phenomena are like us, though. You can’t hold a conversation with gentrification, or teach nitrogen narcosis to play piano.
It strikes me that if you want to rank phenomena as to how like us they are, you have a bunch of humans with gigantic numbers, and then chimps and chatbots rolling around at about .1, and then a bunch of numbers small enough you want scientific notation to express them.
@steven:
Do you devote a significant amount of your time and resources to making paperclips, given the possibility that you’re being simulated? If not, why would a paperclip-maximizer devote time and resources to human life?
steven: your “not 100% sure” is a perfect example of the problem eliezer is trying to explain. “not 100% sure that X is false” is not a valid excuse to waste thought on X if the prior improbability of X is as incredibly tiny as it is for thoughts like “paperclip maximizers will find their own paperclip-related reasons not to murder everyone”.
This is something that’s bothered me a lot about the free market. Many people, often including myself, believe that a bunch of companies which are profit-maximizers (plus some simple laws against use of force) will cause “nice” results. These people believe the effect is so strong that no possible policy directly aimed at niceness will succeed as well as the profit-maximization strategy does. There seems to be a lot of evidence for this. But it also seems too easy, as if you could take ten paper-clip maximizers competing to convert things into differently colored paperclips, and ended out with utopia. It must have something to do with capitalism including a term for the human utility function in the form of demand, but it still seems miraculous.
First, policies don’t aim; actors with intent do. A journalistic peeve of mine. Newspaper writers generally spend the first 10 paragraphs of a story about legislation psychoanalyzing the intent of pieces of paper, and rarely tell you what the pieces of paper actually say.
Second, I don’t consider this a serious pro free market position. It’s not that no “possible” government enforced policy would do better, it’s that the political process is generally unlikely to yield a better policy.
Unfortunately, many people who hold this position don’t know that it’s not serious.
I don’t think that’s an accurate characterization of Austrian economists.
Well, I think it’s a quite accurate depiction of anyone who uses phrases like “a priori science.”
(That is, to the extent that Austrian economics is based on a priori reasoning, various claims about types of government intervention really are claims that no possible such government intervention could ever be good for people)
A priori claims can be probabilistic claims.
Are you aware of a broad tradition of such probabilities that I’m completely unaware of?
It’s really not at all mysterious if you understand the math. Much like how evolution can miraculously create complex life by maximizing “fitness” (i.e. offspring).
Also, when you study the math, you will see the many assumptions that make the result go through. Much like evolution, it doesn’t always turn out. Markets are stupid.
I just googled to find a decent example of the math and this (pdf) is what I came up with. Looks pretty good, but there are many versions of this material available online.
I just realized I responded to a very old comment, which I was lead to by its being the parent of a comment made today. Sigh. Well, hopefully someone finds the link above useful.
This posting, like many of Eliezer’s, offers good advice for someone who is advancing the frontiers of human knowledge and breaking new ground. But most of us aren’t in that situation. Do these considerations offer useful insights for the average person living his life? Or are they just abstract philosophy without practical import for most people?
It seems to me like the simplest way to solve friendliness is: “Ok AI, I’m friendly so do what I tell you to do and confirm with me before taking any action.” It is much simpler to program a goal system that responds to direct commands than to somehow try to infuse ‘friendliness’ into the AI. Granted, marketing wise a ‘friendliness’ infused AI sounds better because it makes those who seek to build such AI seem altruistic. Anyone saying they intend to implement the former seems selfish and power hungry.
Wasn’t that trick tried with Windows Vista, and people were so annoyed by continually being asked trivial “can I do this?” questions that they turned off the security?
I would say yes, the overcoming bias project is useful for laypeople—it’s changed my life, at least. I don’t read these posts as being about becoming a great scientist. I realize the irony of quoting “TMoLFAQ” in the comments to this post, but “[t]here’s no such thing as science”: rationality is about forming beliefs that are actually true, and plans that might actually work. Advancing the frontiers is a special case.
Give utility-maximizers long enough to develop, and they’ll develop increasingly-sophisticated strategies for dealing with each other. ‘Nice’ has nothing to do with it.
As for the main thrust of this post—normal rationality is self-correcting, but it’s good not to be able to make the mistake in the first place. But minimize one type of error, increase the possibility of others. Always take the sure course and you’ll miss opportunities; never take risks, and you’ll accomplish far less than you possibly could. Whether that’s a good thing depends on how serious screwing up is.
“No, let me guess, you’ll be more productive if you’re happy.” Deja vu.
Caledonian, HA has been discussing other superorganisms as possibly being conscious which Eliezer says evolution does not apply to (both could be right). What’s your opinion on such matters?
A paperclip maximizer might keep humans around for a while (because, as of right now, we’re the only beings around that make paperclips) but yeah, if it had enough power (magic nanotechnology, etc.), we’d most likely be gone.
This has been addressed before. Basically, you’ll get what you asked for, but it probably won’t be what you really want.
Do you devote a significant amount of your time and resources to making paperclips, given the possibility that you’re being simulated?
Keeping everyone alive would not take a significant amount of a paperclip maximizer’s time and resources. (Though for utilitarians this probably means it doesn’t count.) But the key difference is this: human-like goal systems seem like they will gain access to lots more simulation resources than paperclip maximizers or some other specific human-indifferent goal system (the set of all human-indifferent goal systems together is a different matter, but they’re not a coherent bloc).
Caledonian, Those conditions weren’t created in the laboratory, because the individual strategy dominated over the group; ergo, the conditions necessary for that to happen were not met.
I am not sure how to interpret this. When we remove whole groups from the gene pool because of some group characteristic (i.e. averaged over the population of that group), it sounds for me natural to call that a group selection. Do you have some different meaningful definition of group selection? What does it mean in general when you say that the individual strategy dominates?
If the AI receives commands frequently, the AI would be weak—and probably not very competitive. It would be like a child running to its mummy all the time. If it has to make decisions fast, that sort of thing is not on the cards.
If the AI receives commands infrequently, that’s more-or-less what is under discussion.
However, AIs can be expected to naturally defend their goals. It may be best not to provide a convenient interface for changing them—since it could also be used to hijack the AI. That’s especially true if the AI is deployed into “uncertain” territory—e.g. as a consumer robot’s brain. We wouldn’t want consumers to be able to reprogram the AI to kill people—that would not reflect well on the robot company’s image.
I have a question for you Eliezer. When you were figuring out how powerful AIs made from silicon were likely to be, did you have a goal that you wanted? Do you want AI to be powerful so it can stop death?
Good comment. I would really like to hear an answer to this.
[misrepresentation deleted] Evolution will continue. But as the substrate of human-memetic organisms is so flexible, this actually limits their ability to pass on data in a predictable way. Their life is more like fire than ours is; possibly they reflect what biological life may have been like at its beginning, before DNA.
Changing the nature of inheritance changes the way evolution takes place, but it doesn’t alter the evolution itself: the accumulation of persistent traits in a population across time.
Eliezer: You’ve got various people insisting that an arbitrary mind, including an expected paperclip maximizer, would do various nice things or obey various comforting conditions: “Keep humans around, because diversity is important to creativity, and the humans will provide a different point of view.” Now you might want to seriously ask if, even granting that premise, you’d be kept in a nice house with air conditioning; or kept in a tiny cell with life support tubes and regular electric shocks if you didn’t generate enough interesting ideas that day (and of course you wouldn’t be allowed to die);
this seems unlikely to me. Humans are only creative under certain conditions, and the conditions you have described seem to be conducive to turning potentially creative humans into useless lumps of flesh who are insane.
or uploaded to a very small computer somewhere,
this is, in my opinion, a very likely outcome. Though one can argue about what “small” means.
and restarted every couple of years. No, let me guess, you’ll be more productive if you’re happy. So it’s clear why you want that to be the argument; but unlike you, the paperclip maximizer is not frantically searching for a reason not to torture you.
Sorry, the whole scenario is still about as unlikely as your carefully picking up ants on the sidewalk, rather than stepping on them, and keeping them in a happy ant colony for the sole express purpose of suggesting blog comments. There are reasons in my goal system to keep sentient beings alive, even if they aren’t “useful” at the moment. But from the perspective of a Bayesian superintelligence whose only terminal value is paperclips, it is not an optimal use of matter and energy toward the instrumental value of producing diverse and creative ideas for making paperclips, to keep around six billion highly similar human brains.
I don’t think that having six billion highly similar human brains is a good thing, so in this sense I am with the paperclip maximizer. Look at all the boring, generic, average lives that are lived today. Our confinement to human bodies is not a good thing as far as I am concerned. So I’m not taking the world as it is today, and arguing that Universal Instrumental Values will keep it exactly the way it is.
The reason I got interested in UIVs to start with is that I didn’t have a good way to decide what counted as a good outcome.
Saying that grey goo will spread, and then never change or create new forms, is as mistaken as saying that single-celled organisms should never have given rise to multi-cellular organisms because competition between individuals is so stringent.
One of the things that Eliezer doesn’t grasp is that optimization is not something evolution has generally produced because optimization is often maladaptive. What would appear to be an ideal strategy in the short term fails in the long term because overall environmental conditions have a tendency to change. Biologists have long noted that specialization offers great benefits to organisms, but often results in their eventual extinction.
Saying that the tendency of hypothetical nanomachines, or actual corporations, will be to find an optimum configuration and remain that way, is like saying that Always Defectors will dominate games of the Repeated Prisoners’ Dilemma. It’s what the most superficial, summing-over analysis of the problem will indicate, but it misses more subtle points that change everything. There’s a reason Always Defectors don’t dominate populations in the game or in real life.
Why, yes, I do think that has something to do with why the market builds houses with air conditioning instead of tiny little cells.
Well, this particular abstract philosophy could end up having a pretty large practical import for all people, if they end up reprocessed into paperclips. But to answer the intent of your question, hence the whole extension to general optimism as a special case of anthropomorphism.
Name me any high-ranked item that does not share causal parentage with a human. Chimps, for example, are worthy objects of anthropomorphism—and 95% genetically similar to us due to common ancestry.
I think I was pretty much raised believing in the intelligence explosion (i.e. read “Great Mambo Chicken and the Transhuman Condition” before puberty). As a teenager I thought it was likely that AIs would be able to violate what our civilization believes to be the laws of physics, and e.g. enable interstellar travel at FTL speeds. As I grew up and my knowledge became more constraining, and intelligence began to seem less like magic and more like a phenomenon within physics, it became much less absurd to think that an SI might still be constrained by the lightspeed limit we know—especially given the Fermi Paradox. (Of course I do still assign a fair probability that we are very far from knowing the final laws of physics—I would bet at >50% on an SI being able to accomplish in practice at least one thing we deem “physically impossible”.)
If you live with physics as we know it, that does seem to imply no immortality—just living for a very long time, and then dying. Though I still hold out hope.
So the answer to your question’s intent is essentially “Yes” on both counts; and I have grown less confident of my hopes, and less awed, over time. But such trivial and physically possible deeds as building molecular nanotechnology, or thinking a million times as fast as a human, I am still fairly confident about.
It is; the question is whether such group selection can overcome a countervailing individual selection pressure. Mathematically, this requires group selection pressure to be extremely strong, or individual selection pressures to be very weak, or both.
A “group-selected” characteristic would be one produced by selection on the level of groups, such as cannibalism in Michael Wade’s experiment. Not a characteristic that is “nice toward the group” according to a sense of human aesthetics. Although cannibalism does help the group, if high-population groups are regularly eliminated. And in fact this characteristic was produced by group selection; it was a group-fitness-increasing adaptation for population control. Cannibalism from individual selection pressures was much weaker, in the control groups. It’s just not the way that you or I would think of helping.
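The "group selection must be extremely strong to overcome individual selection" point can be illustrated with a toy simulation. This is a deliberately crude sketch I'm adding for illustration, not a model from the thread or from Wade's experiment: all names (`simulate`, the `'B'`/`'R'` alleles) and parameter values are made up, and the model ignores sex, migration, and mutation.

```python
import random

def simulate(generations=50, n_groups=30, group_size=20,
             breeder_offspring=3, restrained_offspring=2,
             group_cull_prob=0.5, seed=0):
    """Toy model: within each group, 'B' (breeder) out-reproduces
    'R' (restrained); groups with more breeders are culled more often
    (group selection against high-population groups). Returns the
    final frequency of the breeder allele."""
    rng = random.Random(seed)
    # every group starts half breeders, half restrainers
    groups = [['B'] * (group_size // 2) + ['R'] * (group_size // 2)
              for _ in range(n_groups)]
    for _ in range(generations):
        survivors = []
        for g in groups:
            # individual selection: breeders leave more offspring
            offspring = []
            for allele in g:
                k = breeder_offspring if allele == 'B' else restrained_offspring
                offspring.extend([allele] * k)
            # resource cap: only group_size random offspring survive
            rng.shuffle(offspring)
            g = offspring[:group_size]
            # group selection: breeder-heavy groups risk extinction
            breeder_frac = g.count('B') / len(g)
            if rng.random() < group_cull_prob * breeder_frac:
                continue  # group goes extinct
            survivors.append(g)
        if not survivors:
            break
        # extinct groups' slots recolonized by copies of survivors
        while len(survivors) < n_groups:
            survivors.append(list(rng.choice(survivors)))
        groups = survivors
    alleles = [a for g in groups for a in g]
    return alleles.count('B') / len(alleles)

print(simulate())  # breeders dominate despite culling of breeder-heavy groups
```

With these (arbitrary) parameters, the breeder allele fixes or nearly fixes: every group drifts toward breeders simultaneously, so culling breeder-heavy groups just recolonizes their slots with other breeder-heavy groups. To see restraint win, you have to crank group extinction up to near-certainty for breeder groups while keeping the within-group advantage tiny, which is the "nigh-impossible conditions" point.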
[Deleted. Caledonian, whenever you say something poorly reasoned or that misrepresents others’ arguments, I am going to treat it as deliberate trolling and excise it. I’ve seen you do better, just keep it consistent.]
I said: “The reason I got interested in UIVs to start with is that I didn’t have a good way to decide what counted as a good outcome.”
So I have realized that perhaps the best prophylactic against anthropomorphic optimism bias is, in fact, to be genuinely unsure of what outcome you think is best. If you don’t have a predetermined idea of what you want, then you have no preferred outcome to argue in favor of. Admittedly this is not something that one can always do, but in the specific case of trying to work out what kind of goals to program an AI with, I think it is possible. Of course there are some outcomes that I don’t want, like a universe full of nothing but paperclips. But for the vast majority of “interesting” outcomes, I am still looking for a way to decide.
TGGP: “No, let me guess, you’ll be more productive if you’re happy Deja vu.”—thanks for the link! So, it seems that yet another person has had roughly the same idea as me...
TGGP, please see ni.codem.us for a brief response to your question that Eliezer will not permit.
To return to the implications of things Eliezer has said recently, please consider this in the context of Eliezer’s search for ‘Friendly AI’.

Eliezer, sharing causal parentage with us sounds like a plausible heuristic for ranking things in terms of similarity to us, but in many important senses an AI could share a great deal of causal parentage with us. So you still need a more detailed argument to rank AI low.
Eliezer:
Personally, I am not disputing the importance of friendliness. My question is, what do you think I should do about it?
If I were an AI expert, I would not be reading this blog since there is clearly very little technical content here.
My time would be simply too valuable to waste reading or writing popular futurism.
I certainly wouldn’t post everyday, just to recapitulate the same material with minor variations (basically just killing time).
Of course, I’m not an expert… but you are. So instead of preaching the end of the world, why aren’t you frantically searching for a way to defer it?
Unless, perhaps, you have given up?
I second spindizzy, yet hope that something major is happening.
Which AI? A Friendly AI shares goal-systemic shape with us due to a direct causal link: humans successfully shaping the FAI. The law against deciding what you want applies when you don’t have control over the outcome—for intermediate cases, where you have partial control, you have to switch on decide-what-you-want for the decision nodes and switch it off for everything else. You can be optimistic about FAI to the extent you expect nice humans to exert successful shaping influence on it.
Even so, I’ve got to say that I don’t think an FAI would be one-tenth as anthropomorphic as most “AIs” depicted in even high-class science fiction; if an FAI had any sort of understandable appearance it would be because that was what we needed, not by its nature.
I don’t want to play burden-of-proof tennis, but I’m not sure how to avoid it in a case like this; human-nice outcomes tend to occur when human-nice goal systems are at work shaping them. The causal link seems obvious enough. In the presence of a paperclip-maximizer this causal link is deleted and the outcome goes back to default-alien. If you say that, even in the absence of a human-nice goal system, the link to nice outcomes is maintained, and you don’t say why, and you demand that I prove it can’t happen, I’m not sure what else to say, except for the obvious: you are made of materials that can be used to make paperclips. If this isn’t what you’re saying, and it probably isn’t, I’m afraid you’ll have to spell it out in more detail.
Spindizzy and sophiesdad, I’ve spent quite a while ramming headlong into the problem of preventing the end of the world. Doing things the obvious way has a great deal to be said for it; but it’s been slow going, and some help would be nice. Research help, in particular, seems to me to probably require someone to read all this stuff at the age of 15 and then study on their own for 7 years after that, so I figured I’d better get started on the writing now.
Perhaps we can use this defense of theory instinct as a simplified map of what we want the AI to do.
So create a paperclip maximizer that is (handwave) somehow restricted from doing anything that its creator would try to convince people it would never do.
This is assuming that how we steer ourselves away from horrid ideas is simpler than how we decide we like something.
@Eliezer Yudkowsky said: Spindizzy and sophiesdad, I’ve spent quite a while ramming headlong into the problem of preventing the end of the world. Doing things the obvious way has a great deal to be said for it; but it’s been slow going, and some help would be nice. Research help, in particular, seems to me to probably require someone to read all this stuff at the age of 15 and then study on their own for 7 years after that, so I figured I’d better get started on the writing now.
I have posted this before without answer, but I’ll try again. You are working alone while seeking an “assistant” who may currently be 15, offering subsistence wages (see Singularity Institute). While I am aware that there is great fear of government involvement in FAI, governments could bring together a new “Manhattan Project” of the greatest minds in the fields of engineering, mathematics, and computer science. If you alone knew how to handle all the aspects, you would already be doing it, instead of thinking about it. Surely you must believe that a good result is more important than personal credit for “preventing the end of the world”? Einstein himself did not believe that the power of the atom could be harnessed by humans, yet when his insight was combined with a TEAM, it was done in three years.
Secondly, you have no way to know what other governments may be doing in clandestine AI research. Why not strive to be first? Yes, I’m well aware that the initial outcome of the Manhattan Project was far from “friendly”, but my cousin in the CIA tells me the US govt has first refusal on any technology that might be vital to national security. I think AI qualifies, so why not use their resources and power? Someone will, and they might not have your “friendly” persuasions.
James: See The Hidden Complexity of Wishes. You can’t think of everything you’d need to ban.
sophiesdad: I don’t see convincing anyone who matters in the government of the potential of AGI, let alone the need for Friendliness, as very likely. Tim Kyger (DOD employee) says “I don’t know a soul in DoD or any of the services off the top of my head that has any inkling of the very existence of trans-H or of the various technical/scientific lanes of approach that are leading to a trans/post-human future of some sort. Zip. Zero. Nada.” If you could get the government involved, it would most likely make things much worse. Paraphrasing Eliezer (2002), you’d be a lot less likely to get a positive outcome than an outright ban on AGI, an utterly unworkable set of regulations designed and enforced by people who don’t understand the problem, or a Manhattan Project run by people who don’t understand the problem and virtually guaranteed to destroy the world.
Eliezer,
You should either: a) ban Caledonian; b) let him write whatever he wants.
Censoring his posts is kind of nasty, because it looks like he can only express opinions you think worth posting. Personally, I think you should choose (a), because his comments are boring, disruptive and useless, but if you don’t wanna do it, then go for (b).
And as for this: “Research help, in particular, seems to me to probably require someone to read all this stuff at the age of 15 and then study on their own for 7 years after that, so I figured I’d better get started on the writing now”, I think it’s kinda dumb, and it will never work out. If you keep pinning your hopes on something like that, somebody will get there first, and it probably won’t be a Friendly outcome.
So you should plow ahead, and perhaps not be so arrogant as to think that no one else on the planet right now can help you with the research. There’re plenty of smart guys out there, and if they have access to the proper literature, I’m sure you can find worthy contributors, instead of waiting 7 more years.
As was pointed out, this might not have the consequences one wants. However, even if that wasn’t true, I’d still be leery of this option—this’d effectively be giving one human unlimited power. History has shown that people who are given unlimited power (or something close to it) tend to easily misuse it, even if they started out with good intentions.
I wanted to ban him, but other commenters requested that he be allowed to stay. So I haven’t banned him, but I’m not going to let his trollings take over the comment threads either. Caledonian can write passable comments when he puts his mind to it; and if that’s all that’s allowed through, he has no motive to write anything else.
Whoever is censoring Caledonian: can it be done without adding the content-free nastiness (such as “bizarre objection”, “illogic”, and “gibberish”)?
Pyramid Head:
If you know of a good way to find such people beyond that what SIAI is already doing, do go ahead and tell us.
This idea may be contaminated by optimism, but to avoid the risk of destroying humanity with AI, would it not be sufficient to make the AI more or less impotent? If it were essentially a brain in a jar type of thing that showcased everything humanity could create in terms of intelligence without the disastrous options of writing its own code or having access to a factory for creating death-bots? I suppose this is also anthropomorphizing the AI because if it were really that super-intelligent it could come up with a way to do its optimization beyond the constraints we think we are imposing. Surely building a toothless though possibly “un-Friendly” AI is a more attainable goal than building an unrestricted Friendly AI?
Constant: Done.
Boris: See That Alien Message.
As for the Manhattan Project, who do you think they’re going to pick to lead it? Some young unknown with a mad brilliant idea? Or, say, Roger Schank? (I’ve got nothing against him personally, but he’s pretty old-style.) Japan tried something like this with their Fifth Generation project. Didn’t help them any.
When the basic theory is done, then I’ll know if implementation requires a Manhattan Project or not. I don’t think it will. AI done right is not about brute force.
Eliezer, I agree SF writers find it far easier to just write AIs, and also true aliens, as just odd humans, as it is far more work to write a plausible intelligent non-human. I don’t mean to be saying much more than the obvious point that an AI that was a “mind child” of human civilization would in that obvious sense share causal parentage with us. A whole brain emulation would have started out as a particular human brain, a hard coded AI would have started out as the sort of code a human would write, and an AI that evolved under pressure to be useful would have evolved to be useful in our human-dominated world, rather than some random world out there. I don’t at all claim that mind children of ours would be particularly “nice”; creatures that share lots of our heritage can obviously be quite unnice.
‘It seems to me like the simplest way to solve friendliness is: “Ok AI, I’m friendly so do what I tell you to do and confirm with me before taking any action.” It is much simpler to program a goal system that responds to direct commands than to somehow try to infuse ‘friendliness’ into the AI.’
As was pointed out, this might not have the consequences one wants. However, even if that wasn’t true, I’d still be leery of this option—this’d effectively be giving one human unlimited power.
Would you expect all the AIs to work together under one person’s direction? Wouldn’t they group up different ways, and work with different people?
In that case, the problem of how to get AIs to be nice and avoid doing things that hurt people boils down to the old problem of how to get people to be nice and avoid doing things that hurt people. The people might have AIs to tell them about some of the consequences of their behavior, which would be an improvement. “But you never told us that humans don’t want small quantities of plutonium in their food. This changes everything.”
But if it just turns into multiple people having large amounts of power, then we need those people not to declare war on each other, and not crush the defenseless, and so on. Just like now except they’d have AIs working for them.
Would it help if we designed Friendly People? We’d need to design society so that Friendly People outcompeted Unfriendly People....
I think you sidestepped the point as it related to your post. Are you rationally taking into account the biasing effect your heartfelt hopes exert on the set of hypotheses raised to your conscious attention as you conspire to save the world?
There’s really no way to eliminate one’s own biases without recourse to objective and impersonal tests, of course. You can’t identify your own mistakes with armchair theorizing. But you have to give him credit for effort.
Indeed. But you can, for example, say “I wish fast takeoff to be possible, so should be less impressed, all else equal, by the number of hypotheses I can think of that happen to support it”.
Do you wish fast takeoff to be possible? Aren’t then Very Horrid Singularities more likely?
Yes, but even then the ballot stuffing is still going on beneath your awareness, right? Doesn’t that still count as some evidence for caution?
RI, the point is not in consciously reevaluating the hypotheses to oppose the bias (=reverse stupidity), but in keeping the wrong conclusion from reaching the level of awareness, in repairing the outlook before it misguides intuition. Intuition is relatively blind, but it is the engine of human intelligence. Feed it with misinterpreted information, and it will turn out gibberish. You can’t convert gibberish into sensible output, and you can’t make the engine produce correct output by hacking it with a hammer. You need to feed it quality fuel.
@Kaj Sotala: I can’t—I’m not smart enough :)
But seriously, do you really think that we ought to wait a decade before a brilliant researcher shows up? And it seems all the more suspicious because this brilliant researcher has to read Eliezer’s material at a tender age, or else he won’t be good enough.
Now don’t get me wrong, I love Eliezer’s posts here, and I’ve learned A LOT of stuff. And I also happen to think that he’s onto something when he talks about Friendly AI (and AI in general). But I don’t see how he can hope to save the world by writing blog posts...
@Pyramid Head: I don’t see how he can hope to save the world by writing blog posts...
Ditto. Autodidactism may be a superior approach for the education of certain individuals, but it allows the individual to avoid one element crucial to production: discipline. Mr. Yudkowsky’s approach, and his resistance to work with others, along with his views that it is his job to save the world, and no one else can do it, suggest an element of savantism. Hardly a quality one would want in a superhuman intelligence.
I, too, enjoy his writing, but the fact that he discovered 200+ year-old Bayesian Probability only seven years ago, and claims that everything he did before that is meaningless, shows the importance of the input of learned associates.
Eliezer, I truly want you, along with others fortunate enough to have the neuronal stuff that produces such uniquely gifted brains, to save me. Please get to work, or I’ll give you a spanking.
I disagree with the last 2 comments.
Eliezer’s priority has gradually shifted over the last 5 years or so from increasing his own knowledge to transmitting what he knows to others, which is exactly the behavior I would expect from someone with his stated goals who knows what he is doing.
Yes, he has suggested or implied many times that he expects to implement the intelligence explosion more or less by himself (and I do not like that), but ever since the Summer of AI his actions (particularly all the effort he has put into blogging and his references to 15-to-18-year-olds, which suggests that he has thought about the most effective audience to target with his blogging) strongly indicate that he understands that the best way for him to assist the singularitarian project at this time is to transmit what he knows to others.
The blog is exactly the choice of means of transmission of scientific knowledge I would expect from someone who knows what he is doing. Surely we can look past the fact that some crusty academics look down on the blog.
I know of no one who has been more effective than Eliezer over the last 8 years or so at transmitting knowledge to people with a high aptitude for math and science.
And the suggestion that Eliezer lacks discipline strikes me as extremely unlikely. Just because a person is extremely intelligent does not mean that it is easy for the person to acquire knowledge at the rate Eliezer has acquired knowledge or to become so effective at transmitting knowledge.
Suppose we break the problem down into multiple parts.
1. Understand how the problem works, what currently happens.
2. Find a better way that things can work, that would not generate the same problems.
3. Find a way to get from here to there.
4. Do it.
Then part 1 might easily be aided by a guy on a blog. Maybe part 2. Possibly part 3.
A blog is better than a newsgroup because the threads don’t scroll off, they’re all sitting on the site’s computer if anybody cares. Also, as old posts are replaced by new posts people stop responding to old posts. So there isn’t as much topic drift, we don’t get threads that last for two years with the same title and discuss two dozen different topics.
I tend to think a wiki would be even better. Or a wiki that lets people revise their own posts at will but not other people’s posts. Then rather than have long arguments where you finally come to some agreement, the arguments could turn into revisions of the original posts, and people could see at a glance where the remaining disagreement is. If you want to see how they got to where they are, you can look at the history of revisions.
None of this will change the world unless somebody pays attention. But it’s much easier to pay attention when there’s a coherent train of thought to pay attention to. And people who have a vague idea they want to flesh out, can do worse than present it on a blog and get people who want to help develop it or poke holes in it. Either or both.
It would seem that the big development in our lifetimes has been the advent of the digital computer, the Turing Machine. Assuming all humans come into the world with no basic knowledge other than hard-wired reflexes, we must all gain our knowledge from those who have preceded us, along with our own reflections about that knowledge and reflections on our environmental observations. The entire Library of Congress is available digitally. Using the concepts of trend analysis and Bayesian Probabilities (and others I don’t know about), couldn’t a properly programmed computer available right now analyze all human knowledge to this point, spotting trends, successful patterns, outcomes that improve the human condition, etc? Couldn’t it then use that knowledge to predict the future trends that would be beneficial? Isn’t that what Eliezer and others are trying to do, albeit in a much more awkward way? I think the computer would discover the importance of Bayesian Probability, if it is important, forthwith. If it is not recursively self-improving, it could at least approach the ideas that Eliezer and others are trying to describe for FAI in a much faster and more efficient way. N’est-ce pas?
Well, is there really no one else in the world right now to work in this problem along with Eliezer (who, in my opinion, don’t lack discipline)? I can’t help but think that it’s rather arrogant...
Well, that’s one of the reasons I’m not a SIAI donor, though. Can’t donate money to someone who writes blogs instead of researching Friendly AI theory. And I’m not nearly smart enough to make any progress on my own, or even help someone else. So I guess mankind is screwed :)
Retired urologist: “properly programmed”
This is the hard part.
@sophiesdad: Autodidactism may be a superior approach for the education of certain individuals, but it allows the individual to avoid one element crucial to production: discipline.
@Pyramid Head: Eliezer (who, in my opinion, don’t lack discipline)
My comment about discipline was not meant to be inflammatory, nor even especially critical. Rather, it was meant to be descriptive of one aspect of autodidactism. In comparison, suppose that Mr. Yudkowsky was working toward his PhD at (say) University of Great Computer Scientists. His chosen topic for his dissertation is “Development of Friendly Artificial Intelligence, Superhuman”. After seven years, he reports to his advisers and shows his writings on Bayesian Probabilities, quantum mechanics, science fiction stories, pure fiction, and views of philosophical ideas widely published for centuries. They ask, “Where is your proposed design for FAI?” He would not receive his degree. Thomas Bayes described Bayesian Probability adequately. Anyone who cannot understand his writings (me, for example) is not qualified to design an FAI, so the fact that Eliezer can help the common man do so is meaningless with regard to his “PhD work”. For the same reason, Mr. Yudkowsky’s wonderful series on quantum mechanics, which I have thoroughly enjoyed, is meaningless so far as advancing new knowledge or recruiting those with adequate brainage to work on FAI. It is entertaining, and particularly it is SELF-ENTERTAINING, but it is not reflective of the discipline necessary to accomplish the stated goal.
Of course, the answer to this is that disciplined, conventional educational and research techniques are what he is trying to avoid. He is right on schedule, but the technique is so brilliant in its conception that no one else can recognize it.
I don’t know you, Eliezer, and I will grant without knowing you that you are a far more special creation than I, and perhaps POTENTIALLY in the lineage of the Newtons, etc. But what if you have a fatal accident tomorrow? What if you have the recessive diseases associated with the high-intelligence Ashkenazi Jews and your life ends while you’re playing games? Will there be any record of what you did? Will someone else be able to stand on your shoulders? Will mankind be any closer to FAI?
Sophiesdad, you should be aware that I’m not likely to take your advice, or even take it seriously. You may as well stop wasting the effort.
@Eliezer: Sophiesdad, you should be aware that I’m not likely to take your advice, or even take it seriously. You may as well stop wasting the effort.
Noted. No more posts from me.
An unusually moderate and temperate exchange.
Eliezer- Have you written anything fictional or otherwise about how you envision an ideal post-fAI or post-singularity world? Care to share?
Oh… I should have read these comments to the end, somehow missed what you said to sophiesdad.
Eliezer… I am very disappointed. This is quite sad.
Well, heck. At least he’s being honest. Maybe a little blunt, but definitely honest.
Ok- Eliezer- you are just a human and therefore prone to anger and reaction to said anger, but you, in particular, have a professional responsibility not to come across as excluding people who disagree with you from the discussion and presenting yourself as the final destination of the proverbial buck. We are all in this together. I have only met you in person once, have only had a handful of conversations about you with people who actually know you, and have only been reading this blog for a few months, and yet I get a distinct impression that you have some sort of narcissistic Hero-God-Complex. I mean, what’s with dressing up in a robe and presenting yourself as the keeper of clandestine knowledge? Now, whether or not you actually feel this way, it is something you project and should endeavor not to, so that people (like sophiesdad) take your work more seriously. “Pyramid Head,” “Pirate King,” and “Emperor with no clothes” are NOT terms of endearment, and this might seem like a ridiculous admonition coming from a person who has self-presented as a ‘pretentious slut,’ but I’m trying to be provocative, not leaderly. YOU are asking all of these people to trust YOUR MIND with the dangers of fAI and the fate of the world and give you money for it! Sorry to hold you to such high standards, but if you present with a personality disorder any competent psychologist can identify, then this will be very hard for you… unless of course you want to go the “I’m the Messiah, abandon all and follow me!” route, set up the Church of Eliezer, and start a religious movement with which to get funding… Might work, but it will be hard to recruit serious scientists to work with you under those circumstances...
IMO, the last two posts, and especially this one, are some of the best on Less Wrong. It’s a pity this post isn’t included in any of the sequences.