You are evidently confused about what the word means. The systematic deletion of any content that relates to an idea that the person with power does not wish to be spoken is censorship in the same way that threatening to (probabilistically) destroy humanity is terrorism. As in, blatantly obviously—it’s just what the words happen to mean.
Going around saying ‘this isn’t censorship’ while doing it would trigger all sorts of ‘crazy cult’ warning bells.
Yes, the acts in question can easily be denoted by the terms “blackmail” and “censorship.” And your final sentence is certainly true as well.
To avoid being called a cult, to avoid being a cult, and to avoid doing bad things generally, we should stop the definition debate and focus on whether people’s behavior has been appropriate. If connotation conundrums keep you quarreling about terms, pick variables (e.g. “what EY did”=E and “what WFG precommitted to doing, and in fact did”=G) and keep talking.
There is still a moral sense in which if, after careful thought, I decided that that material should not have been posted, then any posts which resulted solely from my post are in a sense a violation of my desire to not have posted it. Especially if said posts operate under the illusion that my original post was censored rather than retracted.
But in reality such ideas tend to propagate like the imp of the perverse: a gnawing desire to know what the “censored” material is, even if everyone who knows what it is has subsequently decided that they wished they didn’t! E.g., both Nesov and I have been persuaded (once fully filled in) that this is really nasty stuff and shouldn’t be let out (correct me if I am wrong).
This “imp of the perverse” property is actually part of the reason why the original post is harmful. In a sense, this is an idea-virus which makes people who don’t yet have it want to have it, but as soon as they have been exposed to it, they (belatedly) realize they really didn’t want to know about it or spread it.
The only people who seem to be filled in are you and Yudkowsky. I think Nesov just argues against it based on some very weak belief. As far as I can tell, I got all the material in question. The only possible reason I can see for why one wouldn’t want to spread it is that its negative potential outweighs its very, very low probability (and that only if you accept a long chain of previous beliefs). It doesn’t. It also isn’t the genuine and brilliant idea that all this mystery mongering makes it seem to be. Everyone I sent it to just laughed about it. But maybe you can fill me in?
If the idea is dangerous in the first place (which is very unlikely), it is only dangerous to people who understand it, because understanding it makes you vulnerable. The better you understand it and the more you think about it, the more vulnerable you become. In hindsight, I would prefer to never have read about the idea in question.
I don’t think this is a big issue, considering the tiny probability that the scenario will ever occur, but I am glad that discussing it continues to be discouraged and would appreciate it if people stopped needlessly resurrecting it over and over again.
If the idea is dangerous in the first place (which is very unlikely), it is only dangerous to people who understand it, because understanding it makes you vulnerable.
This strikes me as tautological and/or confusing definitions. I’m happy to agree that the idea is dangerous to people who think it is dangerous, but I don’t think it’s dangerous and I think I understand it. To make an analogy, I understand the concept of hell but don’t think it’s dangerous, and so the concept of hell does not bother me. Does the fact that I do not have the born-again Christian’s fear of hell mean that they understand it and I don’t? I don’t see why it should.
I can’t figure out a way to explain this further without repropagating the idea, which I will not do. It is likely that there are one or more pieces of the idea which you are not familiar with or do not understand, and I envy your epistemological position.
Yes, but the concept of hell is easier to understand. From what I have read in the discussions, I have no idea how the Basilisk is supposed to work, while it’s quite easy to understand how hell is supposed to work.
I would prefer to never have read about the idea in question.
If you people are this worried about reality, why don’t you work to support creating a Paperclip maximizer? It would have a lot of fun doing what it wants to do and everyone else would quickly die. Nobody ever after would have to fear what could possibly happen to them at some point.
If you people want to try to turn the universe into a better place, at whatever cost, then why do you worry or wish to not know about potential obstacles? Both are irrational.
The forbidden topic seems to be a dangerous Ugh field for a lot of people here. You have to decide what you want and then follow through on it. Any self-inflicted pain just adds to the overall negative.
The basilisk idea has no positive value. All it does is cause those who understand it to bear a very low probability of suffering incredible disutility at some point in the future. Explaining this idea to someone does them about as much good as slashing their tires.
The basilisk idea has no positive value. All it does is cause those who understand it to bear a very low probability of suffering incredible disutility at some point in the future.
I understand that but do not see that the description applies to the idea in question, insofar as it is in my opinion no more probable than fiction and any likelihood is outweighed by opposing ideas. There are however other well-founded ideas, free speech and transparency, that are being ignored. I also believe that people would benefit from talking about it and could possibly overcome it and subsequently ignore it.
But I’m tired of discussing this topic and will do you the favor to shut up about it. But remember that I haven’t been the one who started this thread. It was Roko and whoever asked to delete Roko’s comment.
Look, you have three people all of whom think it is a bad idea to spread this. All are smart. Two initially thought it was OK to spread it.
Furthermore, I would add that I wish I had never learned about any of these ideas. In fact, I wish I had never come across the initial link on the internet that caused me to think about transhumanism and thereby about the singularity; I wish very strongly that my mind had never come across the tools to inflict such large amounts of potential self-harm with such small durations of inattention, uncautiousness and/or stupidity, even if it is all premultiplied by a small probability. (not a very small one, mind you. More like 1⁄500 type numbers here)
If this is not enough warning to make you stop wanting to know more, then you deserve what you get.
I wish I had never come across the initial link on the internet that caused me to think about transhumanism and thereby about the singularity;
I wish you’d talk to someone other than Yudkowsky about this. You don’t need anyone to harm you, you already seem to harm yourself. You indulge yourself in self-inflicted psychological stress. As Seneca said, “there are more things that terrify us than there are that oppress us, and we suffer more often in opinion than in reality”. You worry and pay interest for debt that will likely never be made.
Look, you have three people all of whom think it is a bad idea to spread this. All are smart.
I read about quite a few smart people who hold idiot beliefs, I only consider this to be marginal evidence.
Furthermore, I would add that I wish I had never learned about any of these ideas.
You’d rather be some ignorant pleasure maximizing device? For me truth is the most cherished good.
If this is not enough warning to make you stop wanting to know more, then you deserve what you get.
More so than not opening yourself up to a small risk of severe consequences? E.g. if you found a diary that clearly belonged to some organized crime boss, would you open it up and read it? I see this situation as analogous.
Or—if you don’t like that particular truth—care to say which truths you do like?
I can’t tell you, I cherry-pick what I want to know when it is hinted at. But generally most of all I want to know about truths that other agents don’t want me to know about.
There are thousands of truths I know that I don’t want you to know about. (Or, to be more precise, that I want you to not know about.) Are you really most interested in those, out of all the truths I know?
I think I’d be disturbed by that if I thought it were true.
But generally most of all I want to know about truths that other agents don’t want me to know about.
I’m not sure that’s a very good heuristic—are you sure that truly describes the truths you care most about? It seems analogous to the fact that people are more motivated by a cause if they learn some people oppose it, which is silly.
Heh—OK. Thanks for the reply. Yes, that is not that bad a heuristic! Maybe someday you can figure this out in more detail. It is surely good to know what you want.
I love this reply. I don’t think it’s necessarily the best reply, and I don’t really even think it’s a polite reply, but it’s certainly one of the funniest ones I’ve seen here.
Look, you have three people all of whom think it is a bad idea to spread this. All are smart. Two initially thought it was OK to spread it.
I see a lot more than three people here, most of whom are smart, and most of them think that Langford basilisks are fictional, and even if they aren’t, censoring them is the wrong thing to do. You can’t quarantine the internet, and so putting up warning signs makes more people fall into the pit.
I saw the original idea and the discussion around it, but I was (fortunately) under stress at the time and initially dismissed it as so implausible as to be unworthy of serious consideration. Given the reactions to it by Eliezer, Alicorn, and Roko, who seem very intelligent and know more about this topic than I do, I’m not so sure. I do know enough to say that, if the idea is something that should be taken seriously, it’s really serious. I can tell you that I am quite happy that the original posts are no longer present, because if they were I am moderately confident that I would want to go back and see if I could make more sense out of the matter, and if Eliezer, Alicorn, and Roko are right about this, making sense out of the matter would be seriously detrimental to my health.
Thankfully, either it’s a threat but I don’t understand it fully, in which case I’m safe, or it’s not a threat, in which case I’m also safe. But I am sufficiently concerned about the possibility that it’s a threat that I don’t understand fully but might be able to realize independently given enough thought that I’m consciously avoiding extended thought about this matter. I will respond to posts that directly relate to this one but am otherwise done with this topic—rest assured that, if you missed this one, you’re really quite all right for it!
Given the reactions to it by Eliezer, Alicorn, and Roko, who seem very intelligent and know more about this topic than I do, I’m not so sure.
This line of argument really bothers me. What does it mean for E, A, and R to seem very intelligent? As far as I can tell, the necessary conclusion is “I will believe a controversial statement of theirs without considering it.” When you word it like that, the standards are a lot higher than “seem very intelligent”, or at least narrower- you need to know their track record on decisions like this.
(The controversial statement is “you don’t want to know about X,” not X itself, by the way.)
I am willing to accept the idea that (intelligent) specialists in a field may know more about their field than nonspecialists and are therefore more qualified to evaluate matters related to their field than I.
Good point, though I would point out that you need E, A, and R to be specialists when it comes to how people react to X, not just X, and I would say there’s evidence that’s not true.
I agree, but I know what conclusion I would draw from the belief in question if I actually believed it, so the issue of their knowledge of how people react is largely immaterial to me in particular. I was mostly posting to provide a data point in favor of keeping the material off LW, not to attempt to dissolve the issue completely or anything.
When you word it like that, the standards are a lot higher than “seem very intelligent”, or at least narrower- you need to know their track record on decisions like this.
You don’t need any specific kind of proof, you already have some state of knowledge about correctness of such statements. There is no “standard of evidence” for forming a state of knowledge, it just may be that without the evidence that meets that “standard” you don’t expect to reach some level of certainty, or some level of stability of your state of knowledge (i.e. low expectation of changing your mind).
I have not only been warned, but I have stared the basilisk in the eyes, and I’m still here typing about it. In fact, I have only cared enough to do so because it was banned, and I wanted the information on how dangerous it was to judge the wisdom of the censorship.
On a more general note, being terrified of very unlikely terrible events is a known human failure mode. Perhaps it would be more effective at improving human rationality to expose people to ideas like this with the sole purpose of overcoming that sort of terror?
I’ll just second that I also read it a while back (though after it was censored) and thought that it was quite interesting but wrong on multiple levels. Not ‘probably wrong’ but wrong like an invalid logic proof is wrong (though of course I am not 100% certain of anything). My main concern about the censorship is that not talking about what was wrong with the argument will allow the proliferation of the reasoning errors that left people thinking the conclusion was plausible. There is a kind of self-fulfilling prophecy involved in not recognizing these errors which is particularly worrying.
1. Let x = y
2. x^2 = x*y
3. x^2 - y^2 = x*y - y^2
4. (x - y)*(x + y) = y*(x - y)
5. x + y = y
6. y + y = y (substitute using 1)
7. 2y = y
8. 2 = 1
You could refute this by pointing out that step (5) involved division by (x - y) = (y - y) = 0, and you can’t divide by 0.
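To make the refutation concrete, here is a minimal Python sketch that plugs a number into the steps above. The function names `check_steps` and `divide_out` are made up for illustration, and the choice y = 3 is arbitrary; any nonzero value shows the same thing.

```python
def check_steps(y):
    """Evaluate steps 2 to 5 of the '2 = 1' argument for a concrete y, with x = y."""
    x = y  # step 1: let x = y
    return {
        2: x**2 == x * y,                        # both sides equal y^2
        3: x**2 - y**2 == x * y - y**2,          # both sides equal 0
        4: (x - y) * (x + y) == y * (x - y),     # still 0 == 0
        5: x + y == y,                           # only "follows" by dividing by (x - y)
    }


def divide_out(y):
    """Attempt the division that step 5 silently relies on."""
    x = y
    try:
        return ((x - y) * (x + y)) / (x - y)
    except ZeroDivisionError:
        return None  # (x - y) is 0, so the step is not allowed


print(check_steps(3))
print(divide_out(3))
```

Steps 2 through 4 hold only in the trivial sense that both sides are equal, and from step 3 onward both sides are zero. Step 5 is the first one that fails, and the attempted division is exactly where the interpreter objects.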
But imagine if someone claimed that the proof is invalid because “you can’t represent numbers with letters like ‘x’ and ‘y’”. You would think that they don’t understand what is actually wrong with it, or why someone might mistakenly believe it. This is basically my reaction to everyone I have seen oppose the censorship because of some argument they present that the idea is wrong and no one would believe it.
I’m actually not sure if I understand your point. Either it is a round-about way of making it or I’m totally dense and the idea really is dangerous (or some third option).
It’s not that the idea is wrong and no one would believe it, it’s that the idea is wrong and when presented with the explanation for why it’s wrong no one should believe it. In addition, it’s kind of important that people understand why it’s wrong. I’m sympathetic to people with different minds that might have adverse reactions to things I don’t, but the solution to that is to warn them off, not censor the topics entirely.
This is a politically reinforced heuristic that does not work for this problem.
Transparency is very important regarding people and organisations in powerful and unique positions. The way they act and what they claim in public is weak evidence in support of their honesty. To claim that they have to censor certain information in the name of the greater public good, and to fortify the decision based on their public reputation, provides no evidence about their true objectives. The only way to solve this issue is by means of transparency.
Surely transparency might have negative consequences, but it need not, and its benefits can outweigh the potential risks of just believing that certain people are telling the truth and are not engaging in deception to follow through on their true objectives.
There is also nothing that Yudkowsky has ever achieved that would sufficiently prove his superior intellect and in turn justify people in simply believing him about some extraordinary claim.
When I say something is a misapplied politically reinforced heuristic, you only reinforce my point by making fully general political arguments that it is always right.
Censorship is not the most evil thing in the universe. The consequences of transparency are allowed to be worse than censorship. Deal with it.
When I say something is a misapplied politically reinforced heuristic, you only reinforce my point by making fully general political arguments that it is always right.
I already had Anna Salamon telling me something about politics. You sound just as incomprehensible to me. Sorry, not meant as an attack.
Censorship is not the most evil thing in the universe. The consequences of transparency are allowed to be worse than censorship. Deal with it.
I stated several times in the past that I am completely in favor of censorship, I have no idea why you are telling me this.
Our rules and intuitions about free speech and censorship are based on the types of censorship we usually see in practice. Ordinarily, if someone is trying to censor a piece of information, then that information falls into one of two categories: either it’s information that would weaken them politically, by making others less likely to support them and more likely to support their opponents, or it’s information that would enable people to do something that they don’t want done.
People often try to censor information that makes people less likely to support them, and more likely to support their opponents. For example, many governments try to censor embarrassing facts (“the Purple Party takes bribes and kicks puppies!”), the fact that opposition exists (“the Pink Party will stop the puppy-kicking!”) and its strength (“you can join the Pink Party, there are 10^4 of us already!”), and organization of opposition (“the Pink Party rally is tomorrow!”). This is most obvious with political parties, but it happens anywhere people feel like there are “sides”—with religions (censorship of “blasphemy”) and with public policies (censoring climate change studies, reports from the Iraq and Afghan wars). Allowing censorship in this category is bad because it enables corruption, and leaves less-worthy groups in charge.
The second common instance of censorship is encouragement and instructions for doing things that certain people don’t want done. Examples include cryptography, how to break DRM, pornography, and bomb-making recipes. Banning these is bad if the capability is suppressed for a bad reason (cryptography enables dissent), if it’s entangled with other things (general-purpose chemistry applies to explosives), or if it requires infrastructure that can also be used for the first type of censorship (porn filters have been caught blocking politicians’ campaign sites).
These two cases cover 99.99% of the things we call “censorship”, and within these two categories, censorship is definitely bad, and usually worth opposing. It is normally safe to assume that if something is being censored, it is for one of these two reasons. There are gray areas—slander (when the speaker knows he’s lying and has malicious intent), and bomb-making recipes (when they’re advertised as such and not general-purpose chemistry), for example—but the law has the exceptions mapped out pretty accurately. (Slander gets you sued, bomb-making recipes get you surveilled.) This makes a solid foundation for the principle that censorship should be opposed.
However, that principle and the analysis supporting it apply only to censorship that falls within these two domains. When things fall outside these categories, we usually don’t call them censorship; for example, there is a widespread conspiracy among email and web site administrators to suppress ads for Viagra, but we don’t call that censorship, even though it meets every aspect of the definition except motive. If you happen to find a weird instance of censorship which doesn’t fall into either category, then you have to start over and derive an answer to whether censorship in that particular case is good or bad, from scratch, without resorting to generalities about censorship-in-general. Some of the arguments may still apply—for example, building a censorship-technology infrastructure is bad even if it’s only meant to be used on spam—but not all of them, and not with the same force.
If the usual arguments against censorship don’t apply, and we’re trying to figure out whether to censor it, the next two things to test are whether it’s true, and whether an informed reader would want to see it. If both of these conditions hold, then it should not be censored. However, if either condition fails to hold, then it’s okay to censor.
Either the forbidden post is false, in which case it does not deserve protection because it’s false, or it’s true, in which case it should be censored because no informed person should want to see it. In either case, people spreading it are doing a bad thing.
Either the forbidden post is false, in which case it does not deserve protection because it’s false,
Even if this is right the censorship extends to perhaps true conversations about why the post is false. Moreover, I don’t see what truth has to do with it. There are plenty of false claims made on this site that nonetheless should be public because understanding why they’re false and how someone might come to think that they are true are worthwhile endeavors.
The question here is rather straightforward: does the harm of the censorship outweigh the harm of letting people talk about the post? I can understand how you might initially think those who disagree with you are just responding to knee-jerk anti-censorship instincts that aren’t necessarily valid here. But from where I stand the arguments made by those who disagree with you do not fit this pattern. I think XiXi has been clear in the past about why the transparency concern does apply to SIAI. We’ve also seen arguments for why censorship in this particular case is a bad idea.
Either the forbidden post is false, in which case it does not deserve protection because it’s false, or it’s true, in which case it should be censored because no informed person should want to see it. In either case, people spreading it are doing a bad thing.
There are clearly more than two options here. There seem to be two points under contention:
It is/is not (1/2) reasonable to agree with the forbidden post.
It is/is not (3/4) desirable to know the contents of the forbidden post.
You seem to be restricting us to either 2+3 or 1+4. It seems that 1+3 is plausible (should we keep children from ever knowing about death because it’ll upset them?), and 2+4 seems like a good argument for restriction of knowledge (the idea is costly until you work through it, and the benefits gained from reaching the other side are lower than the costs).
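For anyone who finds the numbering hard to follow, here is a small Python sketch that enumerates the four combinations. The helper name `label` is made up for illustration; the numbers follow the two points listed above (1/2 for reasonable or not, 3/4 for desirable or not).

```python
from itertools import product


def label(reasonable, desirable):
    """Map the two yes/no questions onto the 1/2 and 3/4 labels used above."""
    first = 1 if reasonable else 2
    second = 3 if desirable else 4
    return f"{first}+{second}"


# All four combinations of (reasonable to agree, desirable to know).
combos = [label(r, d) for r, d in product([True, False], repeat=2)]

# The two cases the either/or argument is said to allow.
dichotomy = {"2+3", "1+4"}
left_out = [c for c in combos if c not in dichotomy]

print(combos)    # ['1+3', '1+4', '2+3', '2+4']
print(left_out)  # ['1+3', '2+4']
```

The two combinations left out are exactly the ones discussed above: 1+3 (the children-and-death example) and 2+4 (the case for restricting knowledge).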
But I personally suspect 2+3 is the best description, and that doesn’t explain why people trying to spread it are doing a bad thing. Should we delete posts on Pascal’s Wager because someone might believe it?
Either the forbidden post is false, in which case it does not deserve protection because it’s false, or it’s true, in which case it should be censored because no informed person should want to see it.
Excluded middle, of course: incorrect criterion. (Was this intended as a test?) It would not deserve protection if it were useless (like spam), not “if it were false.”
The reason I consider sufficient to keep it off LessWrong is that it actually hurt actual people. That’s pretty convincing to me. I wouldn’t expunge it from the Internet (though I might put a warning label on it), but from LW? Appropriate. Reposting it here? Rude.
Unfortunately, that’s also an argument as to why it needs serious thought applied to it, because if the results of decompartmentalised thinking can lead there, humans need to be able to handle them. As Vaniver pointed out, there are previous historical texts that have had similar effects. Rationalists need to be able to cope with such things, as they have learnt to cope with previous conceptual basilisks. So it’s legitimate LessWrong material at the same time as being inappropriate for here. Tricky one.
(To the ends of that “compartmentalisation” link, by the way, I’m interested in past examples of basilisks and other motifs of harmful sensation in idea form. Yes, I have the deleted Wikipedia article.)
Note that I personally found the idea itself silly at best.
The assertion that if a statement is not true, fails to alter political support, fails to provide instruction, and an informed reader wants to see that statement, it is therefore a bad thing to spread that statement and an OK thing to censor, is, um, far from uncontroversial.
To begin with, most fiction falls into this category. For that matter, so does most nonfiction, though at least in that case the authors generally don’t intend for it to be non-true.
The assertion that if a statement is not true, fails to alter political support, fails to provide instruction, and an informed reader wants to see that statement, it is therefore a bad thing to spread that statement and an OK thing to censor, is, um, far from uncontroversial.
No, you reversed a sign bit: it is okay to censor if an informed reader wouldn’t want to see it (and the rest of those conditions).
No, I don’t think so. You said “if either condition fails to hold, then it’s okay to censor.” If it isn’t true, and an informed reader wants to see it, then one of the two conditions failed to hold, and therefore it’s OK to censor.
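The rule as literally quoted can be written out as a truth table. This is only a sketch of that one sentence, assuming the usual anti-censorship arguments have already been ruled out; the function name `okay_to_censor` and the argument names are made up for illustration.

```python
def okay_to_censor(is_true, informed_reader_wants_it):
    """Literal reading: okay to censor if either condition fails to hold."""
    return not (is_true and informed_reader_wants_it)


# Key: (is_true, informed_reader_wants_it) -> okay to censor?
table = {
    (t, w): okay_to_censor(t, w)
    for t in (True, False)
    for w in (True, False)
}

for (t, w), verdict in table.items():
    print(f"true={t!s:<5} wanted={w!s:<5} -> okay to censor: {verdict}")
```

The row for a false statement that an informed reader does want to see comes out as okay to censor, which is the case being objected to here (most fiction would land in that row).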
Oops, you’re right—one more condition is required. The condition I gave is only sufficient to show that it fails to fall into a protected class, not that it falls in the class of things that should be censored; there are things which fall in neither class (which aren’t normally censored because that requires someone with a motive to censor it, which usually puts it into one of the protected classes). To make it worthy of censorship, there must additionally be a reason outside the list of excluded reasons to censor it.
I just have trouble understanding what you are saying. That might very well be my fault. I do not intend any hostile attack against you or the SIAI. I’m just curious, not worried at all. I do not demand anything. I’d like to learn more about you people, what you believe and how you arrived at your beliefs.
There is this particular case of the forbidden topic and I am throwing everything I got at it to see if the beliefs about it are consistent and hold water. That doesn’t mean that I am against censorship or that I believe it is wrong. I believe it is right but too unlikely (...). I believe that Yudkowsky and the SIAI are probably honest (although my gut feeling is to be very skeptical) but that there are good arguments for more transparency regarding the SIAI (if you believe it is as important as being portrayed). I believe that Yudkowsky is wrong about his risk estimation regarding the idea.
I just don’t understand your criticism of my past comments and that included telling me something about how I use politics (I don’t get it) and that I should accept that censorship sometimes is necessary (which I haven’t argued against).
There is this particular case of the forbidden topic and I am throwing everything I got at it to see if the beliefs about it are consistent and hold water.
The problem with that is that Eliezer and those who agree with him, including me, cannot speak freely about our reasoning on the issue, because we don’t want to spread the idea, so we don’t want to describe it and point to details about it as we describe our reasoning. If you imagine yourself in our position, believing the idea is dangerous, you could tell that you wouldn’t want to spread the idea in the process of explaining its danger either.
Under more normal circumstances, where the ideas we disagree about are not thought by anyone to be dangerous, we can have effective discussion by laying out our true reasons for our beliefs, and considering counter arguments that refer to the details of our arguments. Being cut off from our normal effective methods of discussion is stressful, at least for me.
I have been trying to persuade people who don’t know the details of the idea or don’t agree that it is dangerous that we do in fact have good reasons for believing it to be dangerous, or at least that this is likely enough that they should let it go. This is a slow process, as I think of ways to express my thoughts without revealing details of the dangerous idea, or explaining them to people who know but don’t understand those details. And this ends up involving talking to people who, because they don’t think the idea is dangerous and don’t take it seriously, express themselves faster and less carefully, and who have conflicting goals like learning or spreading the idea, or opposing censorship in general, or having judged for themselves the merits of censorship (from others just like them) in this case. This is also stressful.
I engage in this stressful topic, because I think it is important, both that people do not get hurt from learning about this idea, and that SIAI/Eliezer do not get dragged through mud for doing the right thing.
Sorry, but I am not here to help you get the full understanding you need to judge if the beliefs are consistent and hold water. As I have been saying, this is not a normal discussion. And seriously, you would be better off dropping it and finding something else to worry about. And if you think it is important, you can remember to track if SIAI/Eliezer/supporters like me engage in a pattern of making excuses to ban certain topics to protect some hidden agenda. But then please remember all the critical discussions that don’t get banned.
I have been trying to persuade people who don’t know the details of the idea or don’t agree that it is dangerous that we do in fact have good reasons for believing it to be dangerous, or at least that this is likely enough that they should let it go. This is a slow process, as I think of ways to express my thoughts without revealing details of the dangerous idea, or explaining them to people who know but don’t understand those details.
Note that this shouldn’t be possible other than through arguments from authority.
(I’ve just now formed a better intuitive picture of the reasons for danger of the idea, and saw some of the comments previously made unnecessarily revealing, where the additional detail didn’t actually serve the purpose of convincing people I communicated with, who lacked some of the prerequisites for being able to use that detail to understand the argument for danger, but would potentially gain (better) understanding of the idea. It does still sound silly to me, but maybe the lack of inferential stability of this conclusion should actually be felt this way—I expect that the idea will stop being dangerous in the following decades due to better understanding of decision theory.)
Does this theory of yours require that Eliezer Yudkowsky plus several other old-time Less Wrongians are holding the Idiot Ball and being really stupid about something that you can just see as obvious?
Now might be a good time to notice that you are confused.
Something to keep in mind when you reply to comments here is that you are the default leader of this community and its highest status member. This means comments that would be reasonably glib or slightly snarky from other posters can come off as threatening and condescending when made by you. They’re not really threatening but they can instill in their targets strong fight-or-flight responses. Perhaps this is because in the ancestral environment status challenges from group leaders were far more threatening to our ancestors’ livelihoods than challenges from other group members. When you’re kicking out trolls it’s a sight to see, but when you’re rhetorically challenging honest interlocutors it’s probably counter-productive. I had to step away from the computer because I could tell that even if I was wrong the feelings this comment provoked weren’t going to let me admit it (and you weren’t even actually mean, just snobby).
As to your question, I don’t think my understanding of the idea requires anyone to be an idiot. In fact, from what you’ve said I doubt we’re that far apart on the matter of how threatening the idea is. There may be implications I haven’t thought through that you have, and there may be general responses to implications I’ve thought of that you haven’t. I often have trouble telling how much intelligence I needed to get somewhere but I think I’ve applied a fair amount in this case. Where I think we probably diverge significantly is in our estimation of the cost of the censorship, which I think is more than high enough to outweigh the risk of making Roko’s idea public. It is at least plausible that you are underestimating this cost due to biases resulting from your social position in this group and your organizational affiliation.
I’ll note that, as wedrifid suggested, your position also seems to assume that quite a few Less Wrongians are being really stupid and can’t see the obvious. Perhaps those who have expressed disagreement with your decision aren’t quite as old-time as those who have. And perhaps this is because we have not internalized important concepts or accessed important evidence required to see the danger in Roko’s idea. But it is also noteworthy that the people who have expressed disagreement have mostly been outside the Yudkowsky/SIAI cluster relative to those who have agreed with you. This suggests that they might be less susceptible to the biases that may be affecting your estimation of the cost of the censorship.
I am a bit confused, as I’m not totally sure the explanations I’ve thought of or seen posted for your actions sufficiently explain them, but that’s just the kind of uncertainty one always expects in disagreements. Are you not confused? If I didn’t think there was a downside to the censorship I would let it go. But I think the downside is huge; in particular, I think the censorship makes it much harder to get people beyond the SIAI circle to take Friendliness seriously as a scholarly field. I’m not sure you’re humble enough to care about that (that isn’t meant as a character attack btw). It makes the field look like a joke and makes its leading scholar look ridiculous. I’m not sure you have the political talents to recognize that. It also slightly increases the chances of someone not recognizing this failure mode (the one in Roko’s post) when it counts. I think you might be so sure (or so focused on the possibility that) you’re going to be the one flipping the switch in that situation that you aren’t worried enough about that.
It seems to me that the natural effect of a group leader persistently arguing from his own authority is Evaporative Cooling of Group Beliefs. This is of course conducive to confirmation bias and corresponding epistemological skewing for the leader; things which seem undesirable for somebody in Eliezer’s position. I really wish that Eliezer was receptive to taking this consideration seriously.
The thing is he usually does. That is one thing that has in the past set Eliezer apart from Robin and impressed me about Eliezer. Now it is almost as though he has embraced the evaporative cooling concept as an opportunity instead of a risk and gone and bought himself a blowtorch to force the issue!
Maybe, given the credibility he has accumulated on all these other topics, you should be willing to trust him on the one issue on which he is asserting this authority and on which it is clear that if he is right, it would be bad to discuss his reasoning.
The well known (and empirically verified) weakness in experts of the human variety is that they tend to be systematically overconfident when it comes to judgements that fall outside their area of exceptional performance—particularly when the topic is one just outside the fringes.
When it comes to blogging about theoretical issues of rationality Eliezer is undeniably brilliant. Yet his credibility specifically when it comes to responding to risks is rather less outstanding. In my observation he reacts emotionally and starts making rookie mistakes of rational thought and action. To the point when I’ve very nearly responded ‘Go read the sequences!’ before remembering that he was the flipping author and so should already know better.
Also important is the fact that elements of the decision are about people, not game theory. Eliezer hopefully doesn’t claim to be an expert when it comes to predicting or eliciting optimal reactions in others.
Was it not clear that I do not assign particular credence to Eliezer when it comes to judging risks? I thought I expressed that with considerable emphasis.
I’m aware that you disagree with my conclusions—and perhaps even my premises—but I can assure you that I’m speaking directly to the topic.
Maybe, given the credibility he has accumulated on all these other topics, you should be willing to trust him on the one issue on which he is asserting this authority and on which it is clear that if he is right, it would be bad to discuss his reasoning.
I do not consider this strong evidence as there are many highly intelligent and productive people who hold crazy beliefs:
Francisco J. Ayala, who “…has been called the ‘Renaissance Man of Evolutionary Biology’”, is a geneticist ordained as a Dominican priest. His “discoveries have opened up new approaches to the prevention and treatment of diseases that affect hundreds of millions of individuals worldwide…”
Francis Collins (geneticist, Human Genome Project), noted for his landmark discoveries of disease genes and his leadership of the Human Genome Project (HGP) and described by the Endocrine Society as “one of the most accomplished scientists of our time”, is an evangelical Christian.
Peter Duesberg (a professor of molecular and cell biology at the University of California, Berkeley) claimed that AIDS is not caused by HIV, which made him so unpopular that his colleagues and others have — until recently — been ignoring his potentially breakthrough work on the causes of cancer.
Georges Lemaître (a Belgian Roman Catholic priest) proposed what became known as the Big Bang theory of the origin of the Universe.
Kurt Gödel (logician, mathematician and philosopher) who suffered from paranoia and believed in ghosts. “Gödel, by contrast, had a tendency toward paranoia. He believed in ghosts; he had a morbid dread of being poisoned by refrigerator gases; he refused to go out when certain distinguished mathematicians were in town, apparently out of concern that they might try to kill him.”
Mark Chu-Carroll (PhD Computer Scientist, works for Google as a Software Engineer) “If you’re religious like me, you might believe that there is some deity that created the Universe.” He is running one of my favorite blogs, Good Math, Bad Math, and writes a lot on debunking creationism and other crackpottery.
Nassim Taleb (the author of the 2007 book (completed 2010) The Black Swan) does believe: Can’t track reality with science and equations. Religion is not about belief. We were wiser before the Enlightenment, because we knew how to take knowledge from incomplete information, and now we live in a world of epistemic arrogance. Religious people have a way of dealing with ignorance, by saying “God knows”.
Kevin Kelly (editor) is a devout Christian. Writes pro science and technology essays.
I could continue this list with people like Ted Kaczynski or Roger Penrose. I just wanted to show that intelligence and rational conduct do not rule out the possibility of being wrong about some belief.
Taleb quote doesn’t qualify. (I won’t comment on others.)
I should have made it clearer that it is not my intention to indicate that I believe that those people, or crazy ideas in general, are wrong. But there are a lot of smart people out there who’ll advocate opposing ideas. Using their reputation for being highly intelligent as a reason to follow through on their ideas is in my opinion not a very good idea in itself. I could just believe Freeman Dyson that existing simulation models of climate contain too much error to reliably predict future trends. I could believe Peter Duesberg that HIV does not cause AIDS; after all, he is a brilliant molecular biologist. But I just do not think that any amount of reputation is enough evidence to believe extraordinary claims uttered by such people. And in the case of Yudkowsky, there doesn’t even exist much reputation, and no great achievements at all, that would justify some strong belief in his infallibility. What does exist in Yudkowsky’s case seems to be strong emotional commitment. I just can’t tell if he is honest. If he really believes that he’s working on a policy for some future superhuman intelligence that will rule the universe, then I’m going to be very careful. Not because it is wrong, but because such beliefs imply huge payoffs. Not that I believe he is the disguised Dr. Evil, but can we be sure enough to just trust him with it? Censorship of certain ideas bears more evidence against his honesty than it does in favor of it.
How extensively have you searched for experts who made correct predictions outside their fields of expertise? What would you expect to see if you just searched for experts making predictions outside their field of expertise and then determined if that prediction were correct? What if you limited your search to experts who had expressed the attitude Eliezer expressed in Outside the Laboratory?
I just wanted to show that intelligence and rational conduct do not rule out the possibility of being wrong about some belief.
“Rule out”? Seriously? What kind of evidence is it?
You extracted the “rule out” phrase from the sentence:
I just wanted to show that intelligence and rational conduct do not rule out the possibility of being wrong about some belief.
From within the common phrase ‘do not rule out the possibility’ no less!
None of this affects my point that ruling out the possibility is the wrong, (in fact impossible), standard.
You then make a reference to ‘0 and 1s not probabilities’ with exaggerated incredulity.
Not exaggerated. XiXiDu’s post did seem to be saying: here are these examples of experts being wrong so it is possible that an expert is wrong in this case, without saying anything useful about how probable it is for this particular expert to be wrong on this particular issue.
To put it mildly this struck me as logically rude and in general poor form.
You have made an argument accusing me of logical rudeness that, quite frankly, does not stand up to scrutiny.
Better evidence than I’ve ever seen in support of the censored idea. I have these well-founded principles, free speech and transparency, and weigh them against the evidence I have in favor of censoring the idea. That evidence is merely 1.) Yudkowsky’s past achievements, 2.) his output and 3.) intelligence. That intelligent people have been and are wrong about certain ideas while still being productive and right about many other ideas is evidence to weaken #3. That people lie and deceive to get what they want is evidence against #1 and #2 and in favor of transparency and free speech, which are both already more likely to have a positive impact than the forbidden topic is to have a negative impact.
And what are you trying to tell me with this link? I haven’t seen anyone stating numeric probability estimations regarding the forbidden topic. And I won’t state one either; I’ll just say that it is subjectively improbable enough to ignore it because there are possibly too many very-very-low-probability events to take into account (for every being that will harm me if I don’t do X there is another being that will harm me if I do X, and these cancel each other out). But if you’d like to pull some number out of thin air, go ahead. I won’t because I don’t have enough data to even calculate the probability of AI going FOOM versus a slow development.
You have failed to address my criticisms of your points, that you are seeking out only examples that support your desired conclusion, and that you are ignoring details that would allow you to construct a narrower, more relevant reference class for your outside view argument.
And what are you trying to tell me with this link?
I was telling you that “ruling out the possibility” is the wrong, (in fact impossible), standard.
You have failed to address my criticisms of your points, that you are seeking out only examples that support your desired conclusion.
Only now I understand your criticism. I do not seek out examples to support my conclusion but to weaken your argument that one should trust Yudkowsky because of his previous output. I’m aware that Yudkowsky can very well be right about the idea but do in fact believe that the risk is worth taking. Have I done extensive research on how often people in similar situations have been wrong? Nope. No excuses here, but do you think there are comparable cases of predictions that proved to be reliable? And how much research have you done in this case and about the idea in general?
I was telling you that “ruling out the possibility” is the wrong, (in fact impossible), standard.
I don’t, I actually stated a few times that I do not think that the idea is wrong.
Seeking out just examples that weaken my argument, when I never predicted that no such examples would exist, is the problem I am talking about.
My reason to weaken your argument is not that I want to be right but that I want feedback about my doubts. I said that 1.) people can be wrong, regardless of their previous reputation, 2.) that people can lie about their objectives and deceive by how they act in public (especially when the stakes are high), 3.) that Yudkowsky’s previous output and achievements are not remarkable enough to trust him about some extraordinary claim. You haven’t responded on why you tell people to believe Yudkowsky, in this case, regardless of my objections.
What made you think that supporting your conclusion and weakening my argument are different things?
I’m sorry if I made it appear as if I hold some particular belief. My epistemic state simply doesn’t allow me to arrive at your conclusion. To highlight this I argued in favor of what it would mean to not accept your argument, namely to stand to previously well-established concepts like free speech and transparency. Yes, you could say that there is no difference here, except that I do not care about who is right but what is the right thing to do.
people can be wrong, regardless of their previous reputation
Still, it’s incorrect to argue from existence of examples. You have to argue from likelihood. You’d expect more correctness from a person with reputation for being right than from a person with reputation for being wrong.
People can also go crazy, regardless of their previous reputation, but it’s improbable, and not an adequate argument for their craziness.
And you need to know what fact you are trying to convince people about, not just search for soldier-arguments pointing in the preferred direction. If you believe that the fact is that a person is crazy, you too have to recognize that “people can be crazy” is an inadequate argument for this fact you wish to communicate, and that you shouldn’t name this argument in good faith.
(Craziness is introduced as a less-likely condition than wrongness to stress the structure of my argument, not to suggest that wrongness is as unlikely.)
I said that 1.) people can be wrong, regardless of their previous reputation, 2.) that people can lie about their objectives and deceive by how they act in public (especially when the stakes are high), 3.) that Yudkowsky’s previous output and achievements are not remarkable enough to trust him about some extraordinary claim.
I notice that Yudkowsky wasn’t always self-professed human-friendly. Consider this:
I must warn my reader that my first allegiance is to the Singularity, not humanity. I don’t know what the Singularity will do with us. I don’t know whether Singularities upgrade mortal races, or disassemble us for spare atoms. While possible, I will balance the interests of mortality and Singularity. But if it comes down to Us or Them, I’m with Them. You have been warned.
He’s changed his mind since. That makes it far, far less scary.
He has changed his mind about one technical point in meta-ethics. He now realizes that super-human intelligence does not automatically lead to super-human morality. He is now (IMHO) less wrong. But he retains a host of other (mis)conceptions about meta-ethics which make his intentions abhorrent to people with different (mis)conceptions. And he retains the arrogance that would make him dangerous to those he disagrees with, if he were powerful.
″… far, far less scary”? You are engaging in wishful thinking no less foolish than that for which Eliezer has now repented.
He is now (IMHO) less wrong. But he retains a host of other (mis)conceptions about meta-ethics which make his intentions abhorrent to people with different (mis)conceptions.
I’m not at all sure that I agree with Eliezer about most meta-ethics, and definitely disagree on some fairly important issues. But, that doesn’t make his views necessarily abhorrent. If Eliezer triggers a positive Singularity (positive in the sense that it reflects what he wants out of a Singularity, complete with CEV), I suspect that that will be a universe which I won’t mind living in. People can disagree about very basic issues and still not hate each others’ intentions. They can even disagree about long-term goals and not hate it if the other person’s goals are implemented.
If Eliezer triggers a positive Singularity (positive in the sense that it reflects what he wants out of a Singularity, complete with CEV), I suspect that that will be a universe which I won’t mind living in.
Have you ever had one of those arguments with your SO in which:
It is conceded that your intentions were good.
It is conceded that the results seem good.
The SO is still pissed because of the lack of consultation and/or presence of extrapolation?
I usually escape those confrontations by promising to consult and/or not extrapolate the next time. In your scenario, Eliezer won’t have that option.
When people point out that Eliezer’s math is broken because his undiscounted future utilities lead to unbounded utility, his response is something like “Find better math; discounted utility is morally wrong”.
When Eliezer suggests that there is no path to a positive singularity which allows for prior consultation with the bulk of mankind, my response is something like “Look harder. Find a path that allows people to feel that they have given their informed consent to both the project and the timetable—anything else is morally wrong.”
ETA: In fact, I would like to see it as a constraint on the meaning of the word “Friendly” that it must not only provide friendly consequences, but also, it must be brought into existence in a friendly way. I suspect that this is one of those problems in which the added constraint actually makes the solution easier to find.
Could you link to where Eliezer says that future utilities should not be discounted? I find that surprising, since uncertainty causes an effect roughly equivalent to discounting.
I would also like to point out that achieving public consensus about whether to launch an AI would take months or years, and that during that time, not only is there a high risk of unfriendly AIs, it is also guaranteed that millions of people will die. Making people feel like they were involved in the decision is emphatically not worth the cost.
Could you link to where Eliezer says that future utilities should not be discounted?
He makes the case in this posting. It is a pretty good posting, by the way, in which he also points out some kinds of discounting which he believes are justified. This posting does not purport to be a knock-down argument against discounting future utility—it merely states Eliezer’s reasons for remaining unconvinced that you should discount (and hence for remaining in disagreement with most economic thinkers).
ETA: One economic thinker who disagrees with Eliezer is Robin Hanson. His response to Eliezer’s posting is also well worth reading.
Examples of Eliezer conducting utilitarian reasoning about the future without discounting are legion.
I find that surprising, since uncertainty causes an effect roughly equivalent to discounting.
Tim Tyler makes the same assertion about the effects of uncertainty. He backs the assertion with metaphor, but I have yet to see a worked example of the math. Can you provide one?
Of course, one obvious related phenomenon—it is even mentioned with respect in Eliezer’s posting—is that the value of a promise must be discounted with time due to the increasing risk of non-performance: my promise to scratch your back tomorrow is more valuable to you than my promise to scratch next week—simply because there is a risk that you or I will die in the interim, rendering the promise worthless. But I don’t see how other forms of increased uncertainty about the future should have the same (exponential decay) response curve.
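The promise example can be written out as a small worked equation. This is only a sketch under an assumption not stated in the thread: a constant per-period probability $h$ that the promise becomes worthless (the symbols $V$, $h$, $t$, and $\gamma$ are introduced here for illustration).

```latex
\[
\mathbb{E}[\text{value of a promise due after } t \text{ periods}]
  = V \cdot \Pr(\text{still performable at } t)
  = V\,(1-h)^{t}
  = V\,\gamma^{t},
\qquad \gamma = 1-h .
\]
% Illustrative numbers (assumed): h = 0.01 per day.
% Due tomorrow:  V * 0.99^1  = 0.99 V
% Due in a week: V * 0.99^7 ~= 0.932 V
```

Under that assumption, risk of non-performance is indistinguishable from exponential discounting with factor $\gamma$. The exponential shape comes entirely from the hazard being constant; nothing in this sketch shows that other kinds of uncertainty produce the same curve.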
achieving public consensus about whether to launch an AI would take months or years,
I find that surprising, since uncertainty causes an effect roughly equivalent to discounting.
Tim Tyler makes the same assertion about the effects of uncertainty. He backs the assertion with metaphor, but I have yet to see a worked example of the math. Can you provide one?
Most tree-pruning heuristics naturally cause an effect like temporal discounting. Resource limits mean that you can’t calculate the whole future tree—so you have to prune. Pruning normally means applying some kind of evaluation function early—to decide which branches to prune. The more you evaluate early, the more you are effectively valuing the near-present.
That is not maths—but hopefully it has a bit more detail than previously.
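One way to make the pruning claim concrete is a toy depth-limited search. The sketch below is an illustration under stated assumptions, not anyone's actual algorithm: the tree `TREE`, its rewards, and the choice to score cut-off branches as 0.0 are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy tree: node -> list of (child, immediate_reward) edges.
# Taking "early" pays 10 at once; waiting pays 11, but only three moves deep.
TREE: Dict[str, List[Tuple[str, float]]] = {
    "start": [("early", 10.0), ("wait1", 0.0)],
    "early": [],
    "wait1": [("wait2", 0.0)],
    "wait2": [("late", 11.0)],
    "late": [],
}


def lookahead_value(node: str, depth: int) -> float:
    """Best total reward reachable from `node` using at most `depth` edges.

    Branches cut off by the depth limit are scored 0.0, which is the
    (assumed) crude evaluation function applied at the pruning point.
    """
    if depth == 0 or not TREE[node]:
        return 0.0
    return max(r + lookahead_value(child, depth - 1) for child, r in TREE[node])


def best_first_move(depth: int) -> str:
    """First move chosen by a searcher that can see `depth` edges ahead."""
    return max(
        TREE["start"],
        key=lambda edge: edge[1] + lookahead_value(edge[0], depth - 1),
    )[0]
```

With a horizon of one or two edges the searcher takes the early 10; only at three edges does it see the later 11 and wait. The short-horizon behaviour looks like discounting, but it depends on scoring the cut-off branch as 0.0; an evaluation function that estimated the remaining value of the branch would remove the effect.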
It doesn’t really address the question. In the A* algorithm the heuristic estimates of the objective function are supposed to be upper bounds on utility, not lower bounds. Furthermore, they are supposed to actually estimate the result of the complete computation—not to represent a partial computation exactly.
Furthermore, they are supposed to actually estimate the result of the complete computation—not to represent a partial computation exactly.
Reality check: a tree of possible futures is pruned at points before the future is completely calculated. Of course it would be nice to apply an evaluation function which represents the results of considering all possible future branches from that point on. However, getting one of those that produces results in a reasonable time would be a major miracle.
If you look at things like chess algorithms, they do some things to get a more accurate utility valuation when pruning, such as checking for quiescence. However, they basically just employ a standard evaluation at that point, or sometimes a faster, cheaper approximation. If it is sufficiently bad, the tree gets pruned.
However, getting one of those would be a major miracle.
We are living in the same reality. But the heuristic evaluation function still needs to be an estimate of the complete computation, rather than being something else entirely. If you want to estimate your own accumulation of pleasure over a lifetime, you cannot get an estimate of that by simply calculating the accumulation of pleasure over a shorter period—otherwise no one would undertake the pain of schooling motivated by the anticipated pleasure of high future income.
The question which divides us is whether an extra 10 utils now is better or worse than an additional 11 utils 20 years from now. You claim that it is worse. Period. I claim that it may well be better, depending on the discount rate.
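The dependence on the discount rate can be checked with a few lines of arithmetic. This is a minimal sketch assuming annual compounding at a constant rate; the function names are made up for the example.

```python
def present_value(utils: float, years: float, annual_rate: float) -> float:
    """Value today of `utils` received `years` from now at a constant annual rate."""
    return utils / (1.0 + annual_rate) ** years


def break_even_rate(now: float, later: float, years: float) -> float:
    """Annual rate at which `later` utils in `years` years equal `now` utils today."""
    return (later / now) ** (1.0 / years) - 1.0


# 10 utils now versus 11 utils in 20 years.
rate = break_even_rate(10.0, 11.0, 20.0)
print(f"break-even annual discount rate: {rate:.4%}")
print("11 utils in 20 years at 1%/yr:", present_value(11.0, 20.0, 0.01))
```

Under these assumptions the break-even rate is a little under half a percent per year: below it the later 11 utils are worth more, above it the immediate 10 are.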
I’m not sure I understand the question. What does it mean for a util to be ‘timeless’?
ETA: The question of the interaction of utility and time is a confusing one. In “Against Discount Rates”, Eliezer writes:
The idea that it is literally, fundamentally 5% more important that a poverty-stricken family have clean water in 2008, than that a similar family have clean water in 2009, seems like pure discrimination to me—just as much as if you were to discriminate between blacks and whites.
I think that Eliezer has expressed the issue in almost, but not quite, the right way. The right question is whether a decision maker in 2007 should be 5% more interested in doing something about the 2008 issue than about the 2009 issue. I believe that she should be. If only because she expects that she will have an entire year in the future to worry about the 2009 family without the need to even consider 2008 again. 2008′s water will be already under the bridge.
I’m sure someone else can explain this better than me, but: As I understand it, a util understood timelessly (rather than like money, which there are valid reasons to discount because it can be invested, lost, revalued, etc. over time) builds into how it’s counted all preferences, including preferences that interact with time. If you get 10 utils, you get 10 utils, full stop. These aren’t delivered to your door in a plain brown wrapper such that you can put them in an interest-bearing account. They’re improvements in the four-dimensional state of the entire universe over all time, that you value at 10 utils. If you get 11 utils, you get 11 utils, and it doesn’t really matter when you get them. Sure, if you get them 20 years from now, then they don’t cover specific events over the next 20 years that could stand improvement. But it’s still worth eleven utils, not ten. If you value things that happen in the next 20 years more highly than things that happen later, then utils according to your utility function will reflect that, that’s all.
That (timeless utils) is a perfectly sensible convention about what utility ought to mean. But, having adopted that convention, we are left with (at least) two questions:
Do I (in 2011) derive a few percent more utility from an African family having clean water in 2012 than I do from an equivalent family having clean water in 2013?
If I do derive more utility from the first alternative, am I making a moral error in having a utility function that acts that way?
I would answer yes to the first question. As I understand it, Eliezer would answer yes to the second question and would answer no to the first, were he in my shoes. I would claim that Eliezer is making a moral error in both judgments.
Do I (in 2011) derive a few percent more utility from an African family having clean water in 2012 than I do from an equivalent family having clean water in 2013?
Do you (in the years 2011, 2012, 2013, 2014) derive different relative utilities for these conditions? If so, it seems you have a problem.
I’m sorry. I don’t know what is meant by utility derived in 2014 from an event in 2012. I understand that the whole point of my assigning utilities in 2014 is to guide myself in making decisions in 2014. But no decision I make in 2014 can have an effect on events in 2012. So, from a decision-theoretic viewpoint, it doesn’t matter how I evaluate the utilities of past events. They are additive constants (same in all decision branches) in any computation of utility, and hence are irrelevant.
Or did you mean to ask about different relative utilities in the years before 2012? Yes, I understand that if I don’t use exponential discounting, then I risk inconsistencies.
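The inconsistency risk can be illustrated by comparing exponential discounting with one common non-exponential alternative. The hyperbolic form, the parameter values, and the payoffs of 100 and 104 are assumptions chosen for the sketch; none of them comes from the thread.

```python
from typing import Callable


def exponential(delay: float, rate: float = 0.05) -> float:
    """Exponential discount factor for a payoff `delay` years away."""
    return (1.0 + rate) ** -delay


def hyperbolic(delay: float, k: float = 0.05) -> float:
    """Hyperbolic discount factor, used here as one non-exponential example."""
    return 1.0 / (1.0 + k * delay)


def prefers_earlier(discount: Callable[[float], float], years_until_first: float) -> bool:
    """True if 100 utils at `years_until_first` beats 104 utils one year after that."""
    early = 100.0 * discount(years_until_first)
    late = 104.0 * discount(years_until_first + 1.0)
    return early > late


for name, fn in (("exponential", exponential), ("hyperbolic", hyperbolic)):
    print(name, [prefers_earlier(fn, d) for d in (0, 10)])
```

With the exponential curve the ranking of the two payoffs is the same whether the first is due now or in ten years. With the hyperbolic curve the agent prefers the later payoff when both are far off and switches to the earlier one as it approaches, so evaluations made in different years disagree.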
The right question is whether a decision maker in 2007 should be 5% more interested in doing something about the 2008 issue than about the 2009 issue.
And that is a fact about the 2007 decision maker, not about the 2008 family’s value as compared to the 2009 family’s.
If, in 2007, you present me with a choice of clean water for a family for all of and only 2008 vs 2009, and you further assure me that these families will otherwise survive in hardship, and that their suffering in one year won’t materially affect their next year, and that I won’t have this opportunity again come this time next year, and that flow-on or snowball effects which benefit from an early start are not a factor here—then I would be indifferent to the choice.
If I would not be; if there is something intrinsic about earlier times that makes them more valuable, and not just a heuristic of preferring them for snowballing or flow-on reasons, then that is what Eliezer is saying seems wrong.
The right question is whether a decision maker in 2007 should be 5% more interested in doing something about the 2008 issue than about the 2009 issue. I believe that she should be. If only because she expects that she will have an entire year in the future to worry about the 2009 family without the need to even consider 2008 again. 2008′s water will be already under the bridge.
I would classify that as instrumental discounting. I don’t think anyone would argue with that—except maybe a superintelligence who has already exhausted the whole game tree—and for whom an extra year buys nothing.
FWIW, I genuinely don’t understand your perspective. The extent to which you discount the future depends on your chances of enjoying it—but also on factors like your ability to predict it—and your ability to influence it—the latter are functions of your abilities, of what you are trying to predict and of the current circumstances.
You really, really do not normally want to put those sorts of things into an agent’s utility function. You really, really do want to calculate them dynamically, depending on the agent’s current circumstances, prediction ability levels, actuator power levels, previous experience, etc.
Attempts to put that sort of thing into the utility function would normally tend to produce an inflexible agent, who has more difficulties in adapting and improving. Trying to incorporate all the dynamic learning needed to deal with the issue into the utility function might be possible in principle—but that represents a really bad idea.
Hopefully you can see my reasoning on this issue. I can’t see your reasoning, though. I can barely even imagine what it might possibly be.
Maybe you are thinking that all events have roughly the same level of unpredictability in the future, and there is roughly the same level of difficulty in influencing them, so the whole issue can be dealt with by one (or a small number of) temporal discounting “fudge factors”, and that evolution built us that way because it was too stupid to do any better.
You apparently denied that resource limitation results in temporal discounting. Maybe that is the problem (if so, see my other reply here). However, now you seem to have acknowledged that an extra year in which to worry helps with developing plans. What I can see doesn’t seem to make very much sense.
You really, really do not normally want to put those sorts of things into an agent’s utility function.
I really, really am not advocating that we put instrumental considerations into our utility functions. The reason you think I am advocating this is that you have this fixed idea that the only justification for discounting is instrumental. So every time I offer a heuristic analogy explaining the motivation for fundamental discounting, you interpret it as a flawed argument for using discounting as a heuristic for instrumental reasons.
Since it appears that this will go on forever, and I don’t discount the future enough to make the sum of this projected infinite stream of disutility seem small, I really ought to give up. But somehow, my residual uncertainty about the future makes me think that you may eventually take Cromwell’s advice.
You really, really do not normally want to put those sorts of things into an agent’s utility function.
I really, really am not advocating that we put instrumental considerations into our utility functions. The reason you think I am advocating this is that you have this fixed idea that the only justification for discounting is instrumental.
To clarify: I do not think the only justification for discounting is instrumental. My position is more like: agents can have whatever utility functions they like (including ones with temporal discounting) without having to justify them to anyone.
However, I do think there are some problems associated with temporal discounting. Temporal discounting sacrifices the future for the sake of the present. Sometimes the future can look after itself—but sacrificing the future is also something which can be taken too far.
Axelrod suggested that when the shadow of the future grows too short, more defections happen. If people don’t sufficiently value the future, reciprocal altruism breaks down. Things get especially bad when politicians fail to value the future. We should strive to arrange things so that the future doesn’t get discounted too much.
Instrumental temporal discounting doesn’t belong in ultimate utility functions. So, we should figure out what temporal discounting is instrumental and exclude it.
If we are building a potentially-immortal machine intelligence with a low chance of dying and which doesn’t age, those are more causes of temporal discounting which could be discarded as well.
What does that leave? Not very much, IMO. The machine will still have some finite chance of being hit by a large celestial body for a while. It might die—but its chances of dying vary over time; its degree of temporal discounting should vary in response—once again, you don’t wire this in, you let the agent figure it out dynamically.
But the heuristic evaluation function still needs to be an estimate of the complete computation, rather than being something else entirely. If you want to estimate your own accumulation of pleasure over a lifetime, you cannot get an estimate of that by simply calculating the accumulation of pleasure over a shorter period—otherwise no one would undertake the pain of schooling motivated by the anticipated pleasure of high future income.
The point is that resource limitation makes these estimates bad estimates—and you can’t do better by replacing them with better estimates because of … resource limitation!
To see how resource limitation leads to temporal discounting, consider computer chess. Powerful computers play reasonable games—but heavily resource limited ones fall for sacrifice plays, and fail to make successful sacrifice gambits. They often behave as though they are valuing short-term gain over long term results.
A peek under the hood quickly reveals why. They only bother looking at a tiny section of the game tree near to the current position! More powerful programs can afford to exhaustively search that space—and then move on to positions further out. Also the limited programs employ “cheap” evaluation functions that fail to fully compensate for their short-term foresight—since they must be able to be executed rapidly. The result is short-sighted chess programs.
That resource limitation leads to temporal discounting is a fairly simple and general principle which applies to all kinds of agents.
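A toy sketch of this effect, in Python. The tiny decision tree, the move names, and the zero-valued evaluation at the search horizon are all made up for illustration; none of it is taken from a real chess program. The utility being summed is undiscounted, yet the shallow searcher still behaves as if near-term rewards mattered more:

```python
from typing import Dict, List, Tuple

# Hypothetical game: state -> list of (move, immediate reward, next state).
# "grab" pays a little right away; "wait" pays much more, but only after 4 plies.
TREE: Dict[str, List[Tuple[str, float, str]]] = {
    "start": [("grab", 1.0, "end"), ("wait", 0.0, "w1")],
    "w1": [("wait", 0.0, "w2")],
    "w2": [("wait", 0.0, "w3")],
    "w3": [("cash_in", 10.0, "end")],
    "end": [],
}


def lookahead(state: str, depth: int) -> float:
    """Best undiscounted reward sum reachable within `depth` plies."""
    if depth == 0 or not TREE[state]:
        # Cheap evaluation function: it knows nothing beyond the horizon.
        return 0.0
    return max(reward + lookahead(nxt, depth - 1) for _, reward, nxt in TREE[state])


def best_action(state: str, depth: int) -> str:
    """Move chosen by a searcher that can only afford `depth` plies."""
    move, _, _ = max(TREE[state], key=lambda m: m[1] + lookahead(m[2], depth - 1))
    return move


shallow_choice = best_action("start", 2)  # limited resources
deep_choice = best_action("start", 4)     # enough resources to see the payoff
```

With a two-ply budget the searcher takes the small immediate reward; with four plies it waits for the larger one. Nothing in the values changed, only the search budget.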
To see how resource limitation leads to temporal discounting, consider computer chess.
Why do you keep trying to argue against discounting using an example where discounting is inappropriate by definition? The objective in chess is to win. It doesn’t matter whether you win in 5 moves or 50 moves. There is no discounting. Looking at this example tells us nothing about whether we should discount future increments of utility in creating a utility function.
Instead, you need to look at questions like this: An agent plays go in a coffee shop. He has the choice of playing slowly, in which case the games each take an hour and he wins 70% of them. Or, he can play quickly, in which case the games each take 20 minutes, but he only wins 60% of them. As soon as one game finishes, another begins. The agent plans to keep playing go forever. He gains 1 util each time he wins and loses 1 util each time he loses.
The main decision he faces is whether he maximizes utility by playing slowly or quickly. Of course, he has infinite expected utility however he plays. You can redefine the objective to be maximizing utility flow per hour and still get a ‘rational’ solution. But this trick isn’t enough for the following extended problem:
The local professional offers go lessons. Lessons require a week of time away from the coffee-shop and a 50 util payment. But each week of lessons turns 1% of your losses into victories. Now the question is: Is it worth it to take lessons? How many weeks of lessons are optimal? The difficulty here is that we need to compare the values of a one-shot (50 utils plus a week not playing go) with the value of an eternal continuous flow (the extra fraction of games per hour which are victories rather than losses). But that is an infinite utility payoff from the lessons, and only a finite cost, right? Obviously, the right decision is to take a week of lessons. And then another week after that. And so on. Forever.
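The coffee-shop problem can be sketched numerically. The following Python is only an illustration under stated assumptions: the player plays around the clock, "turns 1% of your losses into victories" is read multiplicatively (each week removes 1% of the remaining losses), each week's utility is credited at the start of that week, and the weekly weighting factor `d` is a hypothetical parameter that does not appear in the problem statement.

```python
HOURS_PER_WEEK = 24 * 7  # assumption: play continues around the clock


def utils_per_hour(games_per_hour: float, win_rate: float) -> float:
    """Expected utility flow per hour: +1 per win, -1 per loss."""
    return games_per_hour * (win_rate - (1.0 - win_rate))


def win_rate_after_lessons(base_win_rate: float, weeks: int) -> float:
    """Assumed reading: each week of lessons removes 1% of the remaining losses."""
    losses = (1.0 - base_win_rate) * (0.99 ** weeks)
    return 1.0 - losses


def plan_value(weeks: int, d: float, games_per_hour: float = 3.0,
               base_win_rate: float = 0.6, fee: float = 50.0) -> float:
    """Value of taking `weeks` of lessons first, then playing forever.

    `d` is a weekly weighting factor with 0 < d < 1 (hypothetical parameter).
    """
    fees = fee * (1.0 - d ** weeks) / (1.0 - d)  # 50 utils per lesson week
    weekly_flow = HOURS_PER_WEEK * utils_per_hour(
        games_per_hour, win_rate_after_lessons(base_win_rate, weeks))
    return -fees + (d ** weeks) * weekly_flow / (1.0 - d)


def best_weeks(d: float, max_weeks: int = 2000) -> int:
    """Number of lesson weeks (up to `max_weeks`) with the highest plan value."""
    return max(range(max_weeks + 1), key=lambda n: plan_value(n, d))


slow_flow = utils_per_hour(1.0, 0.7)  # hour-long games, 70% wins
fast_flow = utils_per_hour(3.0, 0.6)  # 20-minute games, 60% wins
```

On the per-hour measure, fast play comes out ahead of slow play. For the lessons, any `d` below 1 gives a finite best number of weeks, and that number keeps growing as `d` approaches 1, which is the "lessons forever" conclusion in the limit.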
Discounting of future utility flows is the standard and obvious way of avoiding this kind of problem and paradox. But now let us see whether we can alter this example to capture your ‘instrumental discounting due to an uncertain future’:
First, the obvious one. Our hero expects to die someday, but doesn’t know when. He estimates a 5% chance of death every year. If he is lucky, he could live for another century. Or he could keel over tomorrow. And when he dies, the flow of utility from playing go ceases. It is very well known that this kind of uncertainty about the future is mathematically equivalent to discounted utility in a certain future. But you seemed to be suggesting something more like the following:
Our hero is no longer certain what his winning percentage will be in the future. He knows that he experiences microstrokes roughly every 6 months, and that each incident takes 5% of his wins and changes them to losses. On the other hand, he also knows that roughly every year he experiences a conceptual breakthrough. And that each such breakthrough takes 10% of his losses and turns them into victories.
Does this kind of uncertainty about the future justify discounting on ‘instrumental grounds’? My intuition says “No, not in this case, but there are similar cases in which discounting would work.” I haven’t actually done the math, though, so I remain open to instruction.
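Both variations can be checked with a rough Python sketch. It rests on simplifying assumptions that are not in the examples themselves: utility arrives as a constant yearly flow, death is checked once at the end of each year, and the microstrokes and breakthroughs follow a fixed schedule (two strokes, then one breakthrough, every year) instead of arriving at random. The function names are made up for illustration.

```python
from typing import List


def expected_lifetime_utils(hazard: float, flow: float, horizon: int = 2000) -> float:
    """Expected undiscounted total when each year ends in death with probability `hazard`."""
    total = 0.0
    for years_lived in range(1, horizon + 1):
        # Probability of surviving years_lived - 1 years and dying after the next one.
        p = (1.0 - hazard) ** (years_lived - 1) * hazard
        total += p * years_lived * flow
    return total


def discounted_utils(factor: float, flow: float, horizon: int = 2000) -> float:
    """Total for an agent who cannot die but weights year t by factor**t."""
    return sum(flow * factor ** t for t in range(horizon))


def yearly_win_rates(win_rate: float, years: int) -> List[float]:
    """Win rate at the end of each year under the assumed fixed schedule."""
    path = [win_rate]
    for _ in range(years):
        for _ in range(2):                            # two microstrokes per year
            win_rate *= 0.95                          # 5% of wins become losses
        win_rate = 1.0 - 0.9 * (1.0 - win_rate)       # 10% of losses become wins
        path.append(win_rate)
    return path


mortal = expected_lifetime_utils(hazard=0.05, flow=1.0)
discounter = discounted_utils(factor=0.95, flow=1.0)
```

The first two totals agree, which is the well-known equivalence between a 5% yearly chance of death and a 0.95 discount factor. Under the fixed schedule, the win rate settles toward the same level from any starting point, so the utility flow levels off instead of shrinking over time. That is at least consistent with the "No, not in this case" intuition, though a random-arrival model might behave differently.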
Why do you keep trying to argue against discounting using an example where discounting is inappropriate by definition? The objective in chess is to win. It doesn’t matter whether you win in 5 moves or 50 moves. There is no discounting. Looking at this example tells us nothing about whether we should discount future increments of utility in creating a utility function.
Temporal discounting is about valuing something happening today more than the same thing happening tomorrow.
Chess computers do, in fact, discount. That is why they prefer to mate you in twenty moves rather than a hundred.
The values of a chess computer do not just tell it to win. In fact, they are complex—e.g. Deep Blue had an evaluation function that was split into 8,000 parts.
Operation consists of maximising the utility function, after foresight and tree pruning. Events that take place in branches after tree pruning has truncated them typically don’t get valued at all, since they are not foreseen. Resource-limited chess computers can find themselves preferring to promote a pawn sooner rather than later. They do so since they fail to see the benefit of sequences leading to promotion later.
So: we apparently agree that resource limitation leads to indifference towards the future (due to not bothering to predict it), but I classify this as a kind of temporal discounting (since rewards in the future get ignored), whereas you apparently don’t.
Hmm. It seems as though this has turned out to be a rather esoteric technical question about exactly which set of phenomena the term “temporal discounting” can be used to refer to.
Earlier we were talking about whether agents focussed their attention on tomorrow—rather than next year. Putting aside the issue of whether that is classified as being “temporal discounting”—or not—I think the extent to which agents focus on the near-future is partly a consequence of resource limitation. Give the agents greater abilities and more resources and they become more future-oriented.
we apparently agree that resource limitation leads to indifference towards the future (due to not bothering to predict it)
No, I have not agreed to that. I disagree with almost every part of it.
In particular, I think that the question of whether (and how much) one cares about the future is completely prior to questions about deciding how to act so as to maximize the things one cares about. In fact, I thought you were emphatically making exactly this point on another branch.
But that is fundamental ‘indifference’ (which I thought we had agreed cannot flow from instrumental considerations). I suppose you must be talking about some kind of instrumental or ‘derived’ indifference. But I still disagree. One does not derive indifference from not bothering to predict—one instead derives not bothering to predict from being indifferent.
Furthermore, I don’t respond to expected computronium shortages by truncating my computations. Instead, I switch to an algorithm which produces less accurate computations at lower computronium costs.
but I classify this as a kind of temporal discounting (since rewards in the future get ignored), whereas you apparently don’t.
And finally, regarding classification, you seem to suggest that you view truncation of the future as just one form of discounting, whereas I choose not to. And that this makes our disagreement a quibble over semantics. To which I can only reply: Please go away Tim.
Furthermore, I don’t respond to expected computronium shortages by truncating my computations. Instead, I switch to an algorithm which produces less accurate computations at lower computronium costs.
I think you would reduce how far you look forward if you were interested in using your resources intelligently and efficiently.
If you only have a million cycles per second, you can’t realistically go 150 ply deep into your go game—no matter how much you care about the results after 150 moves. You compromise—limiting both depth and breadth. The reduction in depth inevitably means that you don’t look so far into the future.
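A back-of-envelope version of this in Python. The numbers are assumptions for illustration only: one node examined per cycle, no pruning, and a branching factor of roughly 250 for go (a commonly quoted ballpark, not a measured value).

```python
def max_full_depth(node_budget: int, branching: int) -> int:
    """Deepest ply d such that the complete tree down to ply d fits in the budget."""
    depth = 0
    nodes = 1   # the root position
    layer = 1   # number of positions at the current ply
    while True:
        layer *= branching
        if nodes + layer > node_budget:
            return depth
        nodes += layer
        depth += 1


go_depth = max_full_depth(node_budget=1_000_000, branching=250)
binary_depth = max_full_depth(node_budget=1_000_000, branching=2)
```

Under these assumptions a million-node budget covers only a couple of plies of exhaustive go search. Pruning and narrower move selection buy more depth, but only by giving up breadth, which is the compromise described above.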
A lot of our communication difficulty arises from using different models to guide our intuitions. You keep imagining game-tree evaluation in a game with perfect information (like chess or go). Yes, I understand your point that in this kind of problem, resource shortages are the only cause of uncertainty—that given infinite resources, there is no uncertainty.
I keep imagining problems in which probability is built in, like the coffee-shop-go-player which I sketched recently. In the basic problem, there is no difficulty in computing expected utilities deeper into the future—you solve analytically and then plug in whatever value for t that you want. Even in the more difficult case (with the microstrokes) you can probably come up with an analytic solution. My models just don’t have the property that uncertainty about the future arises from difficulty of computation.
Right. The real world surely contains problems of both sorts. If you have a problem which is dominated by chaos based on quantum events then more resources won’t help. Whereas with many other types of problems more resources do help.
I recognise the existence of problems where more resources don’t help—I figure you probably recognise that there are problems where more resources do help—e.g. the ones we want intelligent machines to help us with.
The real world surely contains problems of both sorts.
Perhaps the real world does. But decision theory doesn’t. The conventional assumption is that a rational agent is logically omniscient. And generalizing decision theory by relaxing that assumption looks like it will be a very difficult problem.
The most charitable interpretation I can make of your argument here is that human agents, being resource limited, imagine that they discount the future. That discounting is a heuristic introduced by evolution to compensate for those resource limitations. I also charitably assume that you are under the misapprehension that if I only understood the argument, I would agree with it. Because if you really realized that I have already heard you, you would stop repeating yourself.
That you will begin listening to my claim that not all discounting is instrumental is more than I can hope for, since you seem to think that my claim is refuted each time you provide an example of what you imagine to be a kind of discounting that can be interpreted as instrumental.
That you will begin listening to my claim that not all discounting is instrumental is more than I can hope for, since you seem to think that my claim is refuted each time you provide an example of what you imagine to be a kind of discounting that can be interpreted as instrumental.
I am pretty sure that I just told you that I do not think that all discounting is instrumental. Here’s what I said:
I really, really am not advocating that we put instrumental considerations into our utility functions. The reason you think I am advocating this is that you have this fixed idea that the only justification for discounting is instrumental.
To clarify: I do not think the only justification for discounting is instrumental. My position is more like: agents can have whatever utility functions they like (including ones with temporal discounting) without having to justify them to anyone.
Agents can have many kinds of utility function! That is partly a consequence of there being so many different ways for agents to go wrong.
Being rational isn’t about your values; you can rationally pursue practically any goal. Epistemic rationality is a bit different, but I mostly ignore that as being unbiological.
Being moral isn’t really much of a constraint at all. Morality—and right and wrong—are normally with respect to a moral system—and unless a moral system is clearly specified, you can often argue all day about what is moral and what isn’t. Maybe some types of morality are more common than others—due to being favoured by the universe, or something like that—but any such context would need to be made plain in the discussion.
So, it seems (relatively) easy to make a temporal discounting agent that really values the present over the future—just stick a term for that in its ultimate values.
Are there any animals with ultimate temporal discounting? That is tricky, but it isn’t difficult to imagine natural selection hacking together animals that way. So: probably, yes.
Do I use ultimate temporal discounting? Not noticeably, as far as I can tell. I care about the present more than the future, but my temporal discounting all looks instrumental to me. I don’t go in much for thinking about saving distant galaxies, though! I hope that further clarifies.
I should probably review around about now. Instead of that: IIRC, you want to wire temporal discounting into machines, so their preferences better match your own—whereas I tend to think that would be giving them your own nasty hangover.
The real world surely contains problems of both sorts.
Perhaps the real world does. But decision theory doesn’t. The conventional assumption is that a rational agent is logically omniscient. And generalizing decision theory by relaxing that assumption looks like it will be a very difficult problem.
Programs make good models. If you can program it, you have a model of it. We can actually program agents that make resource-limited decisions. Having an actual program that makes decisions is a pretty good way of modeling making resource-limited decisions.
Perhaps we have some kind of underlying disagreement about what it means for temporal discounting to be “instrumental”.
In your example of an agent suffering from a risk of death, my thinking is: this player might opt for a safer life, with reduced risk. Or they might choose to lead a more interesting but more risky life. Their degree of discounting may well adjust itself accordingly, and if so, I would take that as evidence that their discounting was not really part of their pure preferences, but rather was an instrumental and dynamic response to the observed risk of dying.
If, on the other hand, they adjusted the risk level of their lifestyle, and their level of temporal discounting remained unchanged, that would be confirming evidence in favour of the hypothesis that their temporal discounting was an innate part of their ultimate preferences, and not instrumental.
Of course. My point is that observing if the discount rate changes with the risk tells you if the agent is rational or irrational, not if the discount rate is all instrumental or partially terminal.
Stepping back for a moment, terminal values represent what the agent really wants, and instrumental values are things sought en-route.
The idea I was trying to express was: if what an agent really wants is not temporally discounted, then instrumental temporal discounting will produce a predictable temporal discounting curve—caused by aging, mortality risk, uncertainty, etc.
Deviations from that curve would indicate the presence of terminal temporal discounting.
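One possible way to make that comparison concrete, sketched in Python. This is only an illustration: it assumes the sole instrumental cause is a known yearly mortality risk, that the agent's observed weighting of each future year can somehow be measured, and the function names are invented for the example.

```python
from typing import List


def instrumental_weights(hazards: List[float]) -> List[float]:
    """Predicted weight on year t: the probability of surviving to the start of year t."""
    weights = []
    alive = 1.0
    for h in hazards:
        weights.append(alive)
        alive *= 1.0 - h
    return weights


def implied_terminal_factors(observed: List[float], hazards: List[float]) -> List[float]:
    """Ratio of observed to predicted weight; 1.0 throughout means no terminal discounting."""
    predicted = instrumental_weights(hazards)
    return [o / p for o, p in zip(observed, predicted)]


# Hypothetical agent: varying mortality risk plus a 2% per year terminal discount.
hazards = [0.05, 0.02, 0.10, 0.01]
observed = [w * 0.98 ** t for t, w in enumerate(instrumental_weights(hazards))]
residual = implied_terminal_factors(observed, hazards)
```

Here `residual` recovers the 2% per year terminal component that was built into the hypothetical agent. For an agent whose discounting was purely a response to risk, every entry would be 1.0.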
I have no disagreement at all with your analysis here. This is not fundamental discounting. And if you have decision alternatives which affect the chances of dying, then it doesn’t even work to model it as if it were fundamental.
You recently mentioned the possibility of dying in the interim. There’s also the possibility of aging in the interim. Such factors can affect utility calculations.
For example: I would much rather have my grandmother’s inheritance now than years down the line, when she finally falls over one last time—because I am younger and fitter now.
Significant temporal discounting makes sense sometimes—for example, if there is a substantial chance of extinction per unit time. I do think a lot of discounting is instrumental, though—rather than being a reflection of ultimate values—due to things like the future being expensive to predict and hard to influence.
My brain spends more time thinking about tomorrow than about this time next year—because I am more confident about what is going on tomorrow, and am better placed to influence it by developing cached actions, etc. Next year will be important too—but there will be a day before to allow me to prepare for it closer to the time, when I am better placed to do so. The difference is not because I will be older then—or because I might die in the mean time. It is due to instrumental factors.
Of course one reason this is of interest is because we want to know what values to program into a superintelligence. That superintelligence will probably not age—and will stand a relatively low chance of extinction per unit time. I figure its ultimate utility function should have very little temporal discounting.
The problem with wiring discount functions into the agent’s ultimate utility function is that that is what you want it to preserve as it self improves. Much discounting is actually due to resource limitation issues. It makes sense for such discounting to be dynamically reduced as more resources become cheaply available. It doesn’t make much sense to wire-in short-sightedness.
I don’t mind tree-pruning algorithms attempting to normalise partial evaluations at different times—so they are more directly comparable to each other. The process should not get too expensive, though—the point of tree pruning is that it is an economy measure.
Find a path that allows people to feel that they have given their informed consent to both the project and the timetable—anything else is morally wrong.
I suspect you want to replace “feel like they have given” with “give.”
Unless you are actually claiming that what is immoral is to make people fail to feel consulted, rather than to fail to consult them, which doesn’t sound like what you’re saying.
Find a path that allows people to feel that they have given their informed consent to both the project and the timetable—anything else is morally wrong.
I suspect you want to replace “feel like they have given” with “give.”
I think I will go with a simple tense change: “feel that they are giving”. Assent is far more important in the lead-up to the Singularity than during the aftermath.
Although I used the language “morally wrong”, my reason for that was mostly to make the rhetorical construction parallel. My preference for an open, inclusive process is a strong preference, but it is really more political/practical than moral/idealistic. One ought to allow the horses to approach the trough of political participation, if only to avoid being trampled, but one is not morally required to teach them how to drink.
Ah, I see. Sure, if you don’t mean morally wrong but rather politically impractical, then I withdraw my suggestion… I entirely misunderstood your point.
No, I did originally say (and mostly mean) “morally” rather than “politically”. And I should thank you for inducing me to climb down from that high horse.
But he retains a host of other (mis)conceptions about meta-ethics which make his intentions abhorrent to people with different (mis)conceptions.
I submit that I have many of the same misconceptions that Eliezer does; he changed his mind about one of the few places I disagree with him. That makes it far more of a change than it would be for you (one out of eight is a small portion, one out of a thousand is an invisible fraction).
Good point. And since ‘scary’ is very much a subjective judgment, that means that I can’t validly criticize you for being foolish unless I have some way of arguing that yours and Eliezer’s positions in the realm of meta-ethics are misconceptions, something I don’t claim to be able to do.
So, if I wish my criticisms to be objective, I need to modify them. Eliezer’s expressed positions on meta-ethics (particularly his apparent acceptance of act-utilitarianism and his unwillingness to discount future utilities) together with some of his beliefs regarding the future (particularly his belief in the likelihood of a positive singularity and expansion of human population into the universe) make his ethical judgments completely unpredictable to many other people. They are unpredictable because the judgment may turn on subtle differences in the expected consequences of present day actions on people in the distant future. And, if one considers the moral judgments of another person to be unpredictable, and that person is powerful, then one ought to consider that person scary. Eliezer is probably scary to many people.
True, but it has little bearing on whether Eliezer should be scary. That is, “Eliezer is scary to many people” is mostly a fact about many people, and mostly not a fact about Eliezer. The reverse of this (and what I base this distinction on) is that some politicians should be scary, and are not scary to many people.
I’m not sure the proposed modification helps: you seem to have expanded your criticisms so far, in order to have them lead to the judgment you want to reach, that they cover too much.
I mean, sure, unpredictability is scarier (for a given level of power) than predictability. Agreed. But so what?
For example, my judgments will always be more unpredictable to people much stupider than I am than to people about as smart or smarter than I am. So the smarter I am, the scarier I am (again, given fixed power)… or, rather, the more people I am scary to… as long as I’m not actively devoting effort to alleviating those fears by, for example, publicly conforming to current fashions of thought. Agreed.
But what follows from that? That I should be less smart? That I should conform more? That I actually represent a danger to more people? I can’t see why I should believe any of those things.
You started out talking about what makes one dangerous; you have ended up talking about what makes people scared of one whether one is dangerous or not. They aren’t equivalent.
you seem to have expanded your criticisms so far, in order to have them lead to the judgment you want to reach, that they cover too much.
Well, I hope I haven’t done that.
You started out talking about what makes one dangerous; you have ended up talking about what makes people scared of one whether one is dangerous or not.
Well, I certainly did that. I was trying to address the question more objectively, but it seems I failed. Let me try again from a more subjective, personal position.
If you and I share the same consequentialist values, but I know that you are more intelligent, I may well consider you unpredictable, but I won’t consider you dangerous. I will be confident that your judgments, in pursuit of our shared values, will be at least as good as my own. Your actions may surprise me, but I will usually be pleasantly surprised.
If you and I are of the same intelligence, but we have different consequentialist values (both being egoists, with disjoint egos, for example) then we can expect to disagree on many actions. Expecting the disagreement, we can defend ourselves, or even bargain our way to a Nash bargaining solution in which (to the extent that we can enforce our bargain) we can predict each other’s behavior to be that promoting compromise consequences.
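For readers unfamiliar with the term, here is a minimal Python sketch of a Nash bargaining solution for two agents splitting one divisible resource. The utility functions, the zero disagreement payoffs, and the grid search are all assumptions chosen for illustration; the Nash solution itself is the split that maximises the product of each agent's gain over what it would get if bargaining failed.

```python
import math
from typing import Callable


def nash_split(u1: Callable[[float], float], u2: Callable[[float], float],
               d1: float = 0.0, d2: float = 0.0, steps: int = 3000) -> float:
    """Share x in [0, 1] for agent 1 that maximises (u1(x) - d1) * (u2(1 - x) - d2)."""
    best_x, best_product = 0.0, float("-inf")
    for i in range(steps + 1):
        x = i / steps
        gain1 = u1(x) - d1
        gain2 = u2(1.0 - x) - d2
        if gain1 < 0 or gain2 < 0:
            continue  # neither agent accepts less than the disagreement payoff
        if gain1 * gain2 > best_product:
            best_x, best_product = x, gain1 * gain2
    return best_x


# Hypothetical egoists: agent 1 values the resource linearly, agent 2 with diminishing returns.
share_for_agent_1 = nash_split(lambda x: x, math.sqrt)
```

In this made-up case the agent with diminishing returns ends up with the smaller share. The relevant point for the argument is that each side can compute the bargain, and so predict the other's behavior, only if both utility functions and both sets of beliefs are known.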
If, in addition to different values, we also have different beliefs, then bargaining is still possible, though we cannot expect to reach a Pareto optimal bargain. But the more our beliefs diverge, regarding consequences that concern us, the less good our bargains can be. In the limit, when the things that matter to us are particularly difficult to predict, and when we each have no idea what the other agent is predicting, bargaining simply becomes ineffective.
Eliezer has expressed his acceptance of the moral significance of the utility functions of people in the far distant future. Since he believes that those people outnumber us folk in the present, that seems to suggest that he would be willing to sacrifice the current utility of us in favor of the future utility of them. (For example, the positive value of saving a starving child today does not outweigh the negative consequences on the multitudes of the future of delaying the Singularity by one day).
I, on the other hand, systematically discount the future. That, by itself, does not make Eliezer dangerous to me. We could strike a Nash bargain, after all. However, we inevitably also have different beliefs about consequences, and the divergence between our beliefs becomes greater the farther into the future we look. And consequences in the distant future are essentially all that matters to people like Eliezer—the present fades into insignificance by contrast. But, to people like me, the present and near future are essentially all that matter—the distant future discounts into insignificance.
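A small Python illustration of how sharply the discount factor shifts where the value lies. It assumes a constant yearly utility flow and two hypothetical yearly discount factors; neither number is claimed to be anyone's actual position.

```python
def discounted_total(factor: float, start_year: int = 0) -> float:
    """Closed-form sum of factor**t for t = start_year, start_year + 1, ... (0 < factor < 1)."""
    return factor ** start_year / (1.0 - factor)


def share_beyond(factor: float, horizon_years: int) -> float:
    """Fraction of the total discounted value that arrives after `horizon_years`."""
    return discounted_total(factor, horizon_years) / discounted_total(factor)


near_focused = share_beyond(0.95, 50)    # hypothetical heavy discounter
far_focused = share_beyond(0.9999, 50)   # hypothetical near-zero discounter
```

For the heavy discounter, only a small fraction of total value lies more than fifty years out; for the near-zero discounter, almost all of it does. The two agents are effectively optimising over different stretches of time.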
So, Eliezer and I care about different things. Eliezer has some ability to predict my actions because he knows I care about short-term consequences and he knows something about how I predict short-term consequences. But I have little ability to predict Eliezer’s actions, because I know he cares primarily about long term consequences, and they are inherently much more unpredictable. I really have very little justification for modeling Eliezer (and any other act utilitarian who refuses to discount the future) as a rational agent.
I really have very little justification for modeling Eliezer (and any other act utilitarian who refuses to discount the future) as a rational agent.
I wish you would just pretend that they care about things a million times further into the future than you do.
The reason is that there are instrumental reasons to discount: the future disappears into a fog of uncertainty, and you can’t make decisions based on the value of things you can’t foresee.
The instrumental reasons fairly quickly dominate as you look further out—even when you don’t discount in your values. Reading your post, it seems as though you don’t “get” this, or don’t agree with it—or something.
Yes, the far-future is unpredictable—but in decision theory, that tends to make it a uniform grey—not an unpredictable black and white strobing pattern.
I wish you would just pretend that they care about things a million times further into the future than you do.
I don’t need to pretend. Modulo some mathematical details, it is the simple truth. And I don’t think there is anything irrational about having such preferences. It is just that, since I cannot tell whether or not what I do will make such people happy, I have no motive to pay any attention to their preferences.
Yes, the far-future is unpredictable—but in decision theory, that tends to make it a uniform grey—not an unpredictable black and white strobing pattern.
Yet, it seems that the people who care about the future do not agree with you on that. Bostrom, Yudkowsky, Nesov, et al. frequently invoke assessments of far-future consequences (sometimes in distant galaxies) in justifying their recommendations.
I wish you would just pretend that they care about things a million times further into the future than you do.
I don’t need to pretend. Modulo some mathematical details, it is the simple truth.
We have crossed wires here. What I meant is that I wish you would stop protesting about infinite utilities—and how non-discounters are not really even rational agents—and just model them as ordinary agents who discount a lot less than you do.
Objections about infinity strike me as irrelevant and uninteresting.
It is just that, since I cannot tell whether or not what I do will make such people happy, I have no motive to pay any attention to their preferences.
Is that your true objection? I expect you can figure out what would make these people happy easily enough most of the time, e.g. by asking them.
Yes, the far-future is unpredictable—but in decision theory, that tends to make it a uniform grey—not an unpredictable black and white strobing pattern.
Yet, it seems that the people who care about the future do not agree with you on that. Bostrom, Yudkowsky, Nesov, et al. frequently invoke assessments of far-future consequences (sometimes in distant galaxies) in justifying their recommendations.
Indeed. That is partly poetry, though (big numbers make things seem important) - and partly because they think that the far future will be highly contingent on near future events.
The thing they are actually interested in influencing is mostly only a decade or so out. It does seem quite important—significant enough to reach back to us here anyway.
If what you are trying to understand is far enough away to be difficult to predict, and very important, then that might cause some oscillations. That is hardly a common situation, though.
Most of the time, organisms act as though they want to become ancestors. To do that, the best thing they can do is focus on having some grandkids. Expanding their circle of care out a few generations usually makes precious little difference to their actions. The far future is unforeseen, and usually can’t be directly influenced. It is usually not too relevant. Usually, you leave it to your kids to deal with.
It is just that, since I cannot tell whether or not what I do will make such people happy, I have no motive to pay any attention to their preferences.
Is that your true objection? I expect you can figure out what would make these people happy easily enough most of the time, e.g. by asking them.
That is a valid point. So, I am justified in treating them as rational agents to the extent that I can engage in trade with them. I just can’t enter into a long-term Nash bargain with them in which we jointly pledge to maximize some linear combination of our two utility functions in an unsupervised fashion. They can’t trust me to do what they want, and I can’t trust them to judge their own utility as bounded.
I think this is back to the point about infinities. The one I wish you would stop bringing up—and instead treat these folk as though they are discounting only a teeny, tiny bit.
Frankly, I generally find it hard to take these utilitarian types seriously in the first place. A “signalling” theory (holier-than-thou) explains the unusually high prevalence of utilitarianism among moral philosophers, and an “exploitation” theory explains its prevalence among those running charitable causes (utilitarianism-says-give-us-your-money). Those explanations do a good job of modelling the facts about utilitarianism, and are normally a lot more credible than the supplied justifications, IMHO.
I think this is back to the point about infinities.
Which suggests that we are failing to communicate. I am not surprised.
The one I wish you would stop bringing up—and instead treat these folk as though they are discounting only a teeny, tiny bit.
I do that! And I still discover that their utility functions are dominated by huge positive and negative utilities in the distant future, while mine are dominated by modest positive and negative utilities in the near future. They are still wrong even if they fudge it so that their math works.
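To make the disagreement concrete, here is a minimal sketch of how the size of the discount rate decides where the weight of a utility function sits. It assumes a flat stream of one unit of utility per period and simple exponential discounting; the function name `near_term_share` and the particular rates are illustrative assumptions, not anything either side of this thread has committed to.

```python
def near_term_share(discount: float, near_periods: int, horizon: int) -> float:
    """Share of total discounted weight falling in the first `near_periods` periods.

    Assumption: one unit of utility per period, weighted by discount**t.
    Uses the geometric series closed form: sum_{t<n} d**t = (1 - d**n) / (1 - d).
    """
    if not 0.0 < discount < 1.0:
        raise ValueError("discount must be strictly between 0 and 1")
    return (1.0 - discount ** near_periods) / (1.0 - discount ** horizon)


# A steep discounter versus one who discounts "only a teeny, tiny bit".
steep = near_term_share(0.9, near_periods=10, horizon=1_000_000)
slight = near_term_share(0.9999, near_periods=10, horizon=1_000_000)

print(f"steep discounter:  {steep:.3f} of the weight is in the first 10 periods")
print(f"slight discounter: {slight:.5f} of the weight is in the first 10 periods")
```

Under these assumptions the steep discounter places most of the weight on the first ten periods, while the slight discounter places only a tiny fraction there, even though both sums are finite.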
I think this is back to the point about infinities.
Which suggests that we are failing to communicate. I am not surprised.
I went from your “I can’t trust them to judge their own utility as bounded” to your earlier “infinity” point. Possibly I am not trying very hard here, though...
My main issue was you apparently thinking that you couldn’t predict their desires in order to find mutually beneficial trades. I’m not really sure if this business about not being able to agree to maximise some shared function is a big deal for you.
Mm. OK, so you are talking about scaring sufficiently intelligent rationalists, not scaring the general public. Fair enough.
What you say makes sense as far as it goes, assuming some mechanism for reliable judgments about people’s actual bases for their decisions. (For example, believing their self-reports.)
But it seems the question that should concern you is not whether Eliezer bases his decisions on predictable things, but rather whether Eliezer’s decisions are themselves predictable.
Put a different way: by your own account, the actual long-term consequences don’t correlate reliably with Eliezer’s expectations about them… that’s what it means for those consequences to be inherently unpredictable. And his decisions are based on his expectations, of course, not on the actual future consequences. So it seems to follow that once you know Eliezer’s beliefs about the future, whether those beliefs are right or wrong is irrelevant to you: that just affects what actually happens in the future, which you systematically discount anyway.
So if Eliezer is consistent in his beliefs about the future, and his decisions are consistently grounded in those beliefs, I’m not sure what makes him any less predictable to me than you are.
Of course, his expectations might not be consistent. Or they might be consistent but beyond your ability to predict. Or his decisions might be more arbitrary than you suggest here. For that matter, he might be lying outright. I’m not saying you should necessarily trust him, or anyone else.
But those same concerns apply to everybody, whatever their professed value structure. I would say the same things about myself.
So it seems to follow that once you know Eliezer’s beliefs about the future, whether those beliefs are right or wrong is irrelevant to you: that just affects what actually happens in the future, which you systematically discount anyway.
But Eliezer’s beliefs about the future continue to change—as he gains new information and completes new deductions. And there is no way that he can practically keep me informed of his beliefs—neither he nor I would be willing to invest the time required for that communication. But Eliezer’s beliefs about the future impact his actions in the present, and those actions have consequences both in the near and distant future. From my point of view, therefore, his actions have essentially random effects on the only thing that matters to me—the near future.
Absolutely. But who isn’t that true of? At least Eliezer has extensively documented his putative beliefs at various points in time, which gives you some data points to extrapolate from.
I have no complaints regarding the amount of information about Eliezer’s beliefs that I have access to. My complaint is that Eliezer, and his fellow non-discounting act utilitarians, are morally driven by the huge differences in utility which they see as arising from events in the distant future—events which I consider morally irrelevant because I discount the future. No realistic amount of information about beliefs can alleviate this problem. The only fix is for them to start discounting. (I would have added “or for me to stop discounting” except that I still don’t know how to handle the infinities.)
Given that they predominantly care about things I don’t care about, and that I predominantly care about things they don’t worry about, we can only consider each other to be moral monsters.
You and I seem to be talking past each other now. It may be time to shut this conversation down.
Given that they predominantly care about things I don’t care about, and that I predominantly care about things they don’t worry about, we can only consider each other to be moral monsters.
Ethical egoists are surely used to this situation, though. The world is full of people who care about extremely different things from one another.
Yes. And if they both mostly care about modest-sized predictable things, then they can do some rational bargaining. Trouble arises when one or more of them has exquisitely fragile values—when they believe that switching a donation from one charity to another destroys galaxies.
I expect your decision algorithm will find a way to deal with people who won’t negotiate on some topics, or who behave in a manner you have a hard time predicting. Some trouble for you, maybe, but probably not THE END OF THE WORLD.
From my point of view, therefore, his actions have essentially random effects on the only thing that matters to me—the near future.
Looking at the last 10 years, there seems to be some highly-predictable fund raising activity, and a lot of philosophising about the importance of machine morality.
I see some significant patterns there. It is not remotely like a stream of random events. So: what gives?
Sure, the question of whether a superintelligence will construct a superior morality to that which natural selection and cultural evolution have constructed on Earth is in some sense a narrow technical question. (The related question of whether the phrase “superior morality” even means anything is, also.)
But it’s a technical question that pertains pretty directly to the question of whose side one envisions oneself on.
That is, if one answers “yes,” it can make sense to ally with the Singularity rather than humanity (assuming that even means anything) as EY-1998 claims to, and still expect some unspecified good (or perhaps Good) result. Whereas if one answers “no,” or if one rejects the very idea that there’s such a thing as a superior morality, that justification for alliance goes away.
That said, I basically agree with you, though perhaps for different reasons than yours.
That is, even after embracing the idea that no other values, even those held by a superintelligence, can be superior to human values, one is still left with the same choice of alliances. Instead of “side with humanity vs. the Singularity,” the question involves a much narrower subset: “side with humanity vs. FAI-induced Singularity,” but from our perspective it’s a choice among infinities.
Of course, advocates of FAI-induced Singularity will find themselves saying that there is no conflict, really, because an FAI-induced Singularity will express by definition what’s actually important about humanity. (Though, of course, there’s no guarantee that individual humans won’t all be completely horrified by the prospect.)
Though, of course, there’s no guarantee that individual humans won’t all be completely horrified by the prospect.
Recall that after CEV extrapolates current humans’ volitions and construes a coherent superposition, the next step isn’t “do everything that superposition says”, but rather, “ask that superposition the one question ‘Given the world as it is right now, what program should we run next?’, run that program, and then shut down”. I suppose it’s possible that our CEV will produce an AI that immediately does something we find horrifying, but I think our future selves are nicer than that… or could be nicer than that, if extrapolated the right way, so I’d consider it a failure of Friendliness if we get a “do something we’d currently find horrifying for the greater good” AI if a different extrapolation strategy would have resulted in something like a “start with the most agreeable and urgent stuff, and other than that, protect us while we grow up and give us help where we need it” AI.
I really doubt that we’d need an AI to do anything immediately horrifying to the human species in order to allow it to grow up into an awesome fun posthuman civilization, so if CEV 1.0 Beta 1 appeared to be going in that direction, that would probably be considered a bug and fixed.
(shrug) Sure, if you’re right that the “most urgent and agreeable stuff” doesn’t happen to press a significant number of people’s emotional buttons, then it follows that not many people’s emotional buttons will be pressed.
But there’s a big difference between assuming that this will be the case, and considering it a bug if it isn’t.
Either I trust the process we build more than I trust my personal judgments, or I don’t.
If I don’t, then why go through this whole rigamarole in the first place? I should prefer to implement my personal judgments. (Of course, I may not have the power to do so, and prefer to join more powerful coalitions whose judgments are close-enough to mine. But in that case CEV becomes a mere political compromise among the powerful.)
If I do, then it’s not clear to me that “fixing the bug” is a good idea.
That is, OK, suppose we write a seed AI intended to work out humanity’s collective CEV, work out some next-step goals based on that CEV and an understanding of likely consequences, construct a program P to implement those goals, run P, and quit.
Suppose that I am personally horrified by the results of running P. Ought I choose to abort P? Or ought I say to myself “Oh, how interesting: my near-mode emotional reactions to the implications of what humanity really wants are extremely negative. Still, most everybody else seems OK with it. OK, fine: this is not going to be a pleasant transition period for me, but my best guess is still that it will ultimately be for the best.”
Is there some number of people such that if more than that many people are horrified by the results, we ought to choose to abort P?
Does the question even matter? The process as you’ve described it doesn’t include an abort mechanism; whichever choice we make P is executed.
Ought we include such an abort mechanism? It’s not at all clear to me that we should. I can get on a roller-coaster or choose not to get on it, but giving me a brake pedal on a roller coaster is kind of ridiculous.
Sure, the question of whether a superintelligence will construct a superior morality to that which natural selection and cultural evolution have constructed on Earth is in some sense a narrow technical question.
Apparently he changed his mind about a bunch of things.
On what appears to be their current plan, the SIAI don’t currently look very dangerous, IMHO.
Eray Ozkural recently complained: “I am also worried that backwards people and extremists will threaten us, and try to dissuade us from accomplishing our work, due to your scare tactics.”
I suppose that sort of thing is possible—but my guess is that they are mostly harmless.
(Parenthetical about how changing your mind, admitting you were wrong, oops, etc, is a good thing).
Yes, I agree. I don’t really believe that he only learnt how to disguise his true goals. But I’m curious if you would be satisfied with his word alone if he would be able to run a fooming AI next week only if you gave your OK?
He has; this is made abundantly clear in the Metaethics sequence and particularly the “coming of age” sequence. That passage appears to be a reflection of the big embarrassing mistake he talked about, when he thought that he knew nothing about true morality (see “Could Anything Be Right?”) and that a superintelligence with a sufficiently “unconstrained” goal system (or what he’d currently refer to as “a rock”) would necessarily discover the ultimate true morality, so that whatever this superintelligence ended up doing would necessarily be the right thing, whether that turned out to consist of giving everyone a volcano lair full of catgirls/boys or wiping out humanity and reshaping the galaxy for its own purposes.
Needless to say, that is not his view anymore; there isn’t even any “Us or Them” to speak of anymore. Friendly AIs aren’t (necessarily) people, and certainly won’t be a distinct race of people with their own goals and ambitions.
Yes, I’m not suggesting that he is just signaling all that he wrote in the sequences to persuade people to trust him. I’m just saying that when you consider what people are doing for much less than shaping the whole universe to their liking, one might consider some sort of public or third-party examination before anyone is allowed to launch a fooming AI.
It will probably never come to it anyway. Not because the SIAI is not going to succeed, but because if it told anyone that it is even close to implementing something like CEV then the whole might of the world would crush it (if the world didn’t turn rational until then). Because to say that you are going to run a fooming AI will be interpreted as trying to take over all power and rule the universe. I suppose this is also the most likely reason for the SIAI to fail. The idea is out, and once people notice that fooming AI isn’t just science fiction they will do everything to stop anyone from implementing one at all, or to run their own before anyone else does. And who’ll be the first competitor to take out in the race to take over the universe? The SIAI of course, just search Google. I guess it would have been a better idea to make this a stealth project from day one. But that train has left.
Anyway, if the SIAI does succeed one can only hope that Yudkowsky is not Dr. Evil in disguise. But even that would still be better than a paperclip maximizer. I assign more utility to a universe adjusted to Yudkowsky’s volition (or the SIAI) than paperclips (I suppose even if that means I’ll not “like” what happens to me then).
I’m just saying that when you consider what people are doing for much less than shaping the whole universe to their liking, one might consider some sort of public or third-party examination before anyone is allowed to launch a fooming AI.
I don’t see who is going to enforce that. Probably nobody.
What we are fairly likely to see is open-source projects getting more limelight. It is hard to gather mindshare if your strategy is: trust the code to us. Relatively few programmers are likely to buy into such projects—unless you pay them to do so.
So you take him at his word that he’s working in your best interest. You don’t think it is necessary to supervise the SIAI while working towards friendly AI. But once they finished their work, ready to go, you are in favor of some sort of examination before they can implement it. Is that correct?
I don’t think human selfishness vs. public interest is much of a problem with FAI; everyone’s interests with respect to FAI are well correlated, and making an FAI which specifically favors its creator doesn’t give enough extra benefit over an FAI which treats everyone equally to justify the risks (that the extra term will be discovered, or that the extra term introduces a bug). Not even for a purely selfish creator; FAI scenarios just don’t leave enough room for improvement to motivate implementing something else.
On the matter of inspecting AIs before launch, however, I’m conflicted. On one hand, the risk of bugs is very serious, and the only way to mitigate it is to have lots of qualified people look at it closely. On the other hand, if the knowledge that a powerful AI was close to completion became public, it would be subject to meddling by various entities that don’t understand what they’re doing, and it would also become a major target for espionage by groups of questionable motives and sanity who might create UFAIs. These risks are difficult to balance, but I think secrecy is the safer choice, and should be the default.
If your first paragraph turns out to be true, does that change anything with respect to the problem of human and political irrationality? My worry is that even if there is only one rational solution that everyone should favor, how likely is it that people understand and accept this? That might be no problem given the current perception. If the possibility of fooming AI will still be ignored at the point it will be possible to implement friendliness (CEV etc.), then there will be no opposition. So some quick quantum leaps towards AGI will likely allow the SIAI to follow through on it. But my worry is that if the general public or governments notice this possibility and take it seriously, it will turn into a political mess never seen before. The world would have to be dramatically different for the big powers to agree on something like CEV. I still think this is the most likely failure mode in case the SIAI succeeds in defining friendliness before someone else runs a fooming AI. Politics.
These risks are difficult to balance, but I think secrecy is the safer choice, and should be the default.
I agree. But is that still possible? After all we’re writing about it in public. Although to my knowledge the SIAI never suggested that it would actually create a fooming AI, only come up with a way to guarantee its friendliness. But what you said in your second paragraph would suggest that the SIAI would also have to implement friendliness or otherwise people will take advantage of it or simply mess it up.
Although to my knowledge the SIAI never suggested that it would actually create a fooming AI, only come up with a way to guarantee its friendliness.
This?
“The Singularity Institute was founded on the theory that in order to get a Friendly artificial intelligence, someone has got to build one. So, we’re just going to have an organization whose mission is: build a Friendly AI. That’s us.”
You don’t think it is necessary to supervise the SIAI while working towards friendly AI. But once they finished their work, ready to go, you are in favor of some sort of examination before they can implement it.
Probably it would be easier to run the examination during the SIAI’s work, rather than after. Certainly it would save more lives. So, supervise them, so that your examination is faster and more thorough. I am not in favour of pausing the project, once complete, to examine it if it’s possible to examine it in operation.
I do not seek out examples to support my conclusion but to weaken your argument that one should trust Yudkowsky because of his previous output.
You shouldn’t seek to “weaken an argument”, you should seek what is the actual truth, and then maybe ways of communicating your understanding. (I believe that’s what you intended anyway, but think it’s better not to say it this way, as a protective measure against motivated cognition.)
I took wedrifid’s point as being that whether EY is right or not, the bad effect described happens. This is part of the lose-lose nature of the original problem (what to do about a post that hurt people).
I don’t think this rhetoric is applicable. Several very intelligent posters have deemed the idea dangerous; a very intelligent you deems it safe. You argue they are wrong because it is ‘obviously safe’.
Eliezer is perfectly correct to point out that, on the whole of it, ‘obviously it is safe’ just does not seem like strong enough evidence when it’s up against a handful of intelligent posters who appear to have strong convictions.
You argue they are wrong because it is ‘obviously safe’.
Pardon? I don’t believe I’ve said any such thing here or elsewhere. I could of course be mistaken—I’ve said a lot of things and don’t recall them all perfectly. But it seems rather unlikely that I did make that claim because it isn’t what I believe.
I should have known I wouldn’t get away with that, eh? I actually don’t know if you oppose the decision because you think the idea is safe, or because you think that censorship is wronger than the idea is dangerous, or whether you even oppose the decision at all and were merely pointing out appeals to authority. If you could fill me in on the details, I could re-present the argument as it actually applies.
Thank you, and yes I can see the point behind what you were actually trying to say. It is just important to me that I am not misrepresented (even though you had no malicious intent).
There are obvious (well, at least theoretically deducible based on the kind of reasoning I tend to discuss or that used by harry!mor) reasons why it would be unwise to give a complete explanation of all my reasoning.
I will say that ‘censorship is wronger’ is definitely not the kind of thinking I would use. Indeed, I’ve given examples of things that I would definitely censor. Complete with LOTR satire if I recall. :)
I have not only been warned, but I have stared the basilisk in the eyes, and I’m still here typing about it.
This isn’t evidence about that hypothesis, it’s expected that most certainly nothing happens. Yet you write for rhetorical purposes as if it’s supposed to be evidence against the hypothesis. This constitutes either lying or confusion (I expect it’s unintentional lying, with phrases produced without conscious reflection about their meaning, so a little of both lying and confusion).
I have not only been warned, but I have stared the basilisk in the eyes, and I’m still here typing about it.
The point we are trying to make is that we think the people who stared the basilisk in the eyes and metaphorically turned to stone are stronger evidence.
The point we are trying to make is that we think the people who stared the basilisk in the eyes and metaphorically turned to stone are stronger evidence.
I get that. But I think it’s important to consider both positive and negative evidence- if someone’s testimony that they got turned to stone is important, so are the testimonies of people who didn’t get turned to stone.
The question to me is whether the basilisk turns people to stone or people turn themselves into stone. I prefer the second because it requires no magic powers on the part of the basilisk. It might be that some people turn to stone when they see goatse for the first time, but that tells you more about humans and how they respond to shock than about goatse.
Indeed, that makes it somewhat useful to know what sort of things shock other people. Calling this idea ‘dangerous’ instead of ‘dangerous to EY’ strikes me as mind projection.
But I think it’s important to consider both positive and negative evidence- if someone’s testimony that they got turned to stone is important, so are the testimonies of people who didn’t get turned to stone.
It might be that some people turn to stone when they see goatse for the first time, but that tells you more about humans and how they respond to shock than about goatse.
I generally find myself in support of people who advocate a policy of keeping people from seeing Goatse.
I generally find myself in support of people who advocate a policy of keeping people from seeing Goatse.
I’m not sure how to evaluate this statement. What do you mean by “keeping people from seeing Goatse”? Banning? Voluntarily choosing not to spread it? A filter like the one proposed in Australia that checks every request to the outside world?
I am much more sympathetic to “keeping goatse off of site X” than “keeping people from seeing goatse,” and so that’s a reasonable policy. If your site is about posting pictures of cute kittens, then goatse is not a picture of a cute kitten.
However, it seems to me that suspected Langford basilisks are part of the material of LessWrong. Imagine someone posted in the discussion “hey guys, I really want to be an atheist but I can’t stop worrying about whether or not the Rapture will happen, and if it does life will suck.” It seems to me that we would have a lot to say to them about how they could approach the situation more rationally.
And, if Langford basilisks exist, religion has found them. Someone got a nightmare because of Roko’s idea, but people fainted upon hearing Sinners in the Hands of an Angry God. Why are we not looking for the Perseus for this Medusa? If rationality is like an immune system, and we’re interested in refining our rationality, we ought to be looking for antibodies.
However, it seems to me that suspected Langford basilisks are part of the material of LessWrong.
It seems to me that Eliezer’s response as moderator of LessWrong strongly implies that he does not believe this is the case. Your goal, then, would be to convince Eliezer that it ought to be part of the LessWrong syllabus, as it were. Cialdini’s Influence and other texts would probably advise you to work within his restrictions and conform to his desires as much as practical—on a site like LessWrong, though, I am not sure how applicable the advice would be, and in any case I don’t mean to be prescriptive about it.
Okay, but more than four people have engaged with the idea. Should we take a poll?
The problem of course is that majorities often believe stupid things. That is why a free marketplace of ideas free from censorship is a really good thing! The obvious thing to do is exchange information until agreement but we can’t do that, at least not here.
Also, the people who think it should be censored all seem to disagree about how dangerous the idea really is, suggesting it isn’t clear how it is dangerous. It also seems plausible that some people have influenced the thinking of other people- for example it looks like Roko regretted posting after talking to Eliezer. While Roko’s regret is evidence that Eliezer is right, it isn’t the same as independent/blind confirmation that the idea is dangerous.
The problem of course is that majorities often believe stupid things.
When you give all agents equal weight, sure. Without taking a poll of anything except my memory, Eliezer+Roko+VladNesov+Alicorn are against, DavidGerard+waitingforgodel+vaniver are for. Others are more sidelined than supporting a particular side.
The obvious thing to do is exchange information until agreement but we can’t do that, at least not here.
Aumann agreement works in the case of hidden information—all you need are posteriors and common knowledge of the event alone.
While Roko’s regret is evidence that Eliezer is right, it isn’t the same as independent/blind confirmation that the idea is dangerous.
Roko increased his estimation and Eliezer decreased his estimation—and the amounts they did so are balanced according to the strength of their private signals. Looking at two Aumann-agreed conclusions gives you the same evidence as looking at the pre-Aumann (differing) conclusions—the same way that 10, 10 gives you the same average as 5, 15.
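The arithmetic behind the analogy can be checked directly. This is only a sketch of the averaging claim, using the numbers 10, 10 and 5, 15 from the comment; it does not model Aumann's theorem itself, and treating the two estimates as equally weighted is an assumption made for illustration.

```python
from statistics import mean, pstdev

# Estimates before and after the two parties update toward each other.
pre_agreement = [5, 15]
post_agreement = [10, 10]

# The averages match, which is the point of the analogy.
print(mean(pre_agreement), mean(post_agreement))

# The spread does not: agreement removes the visible disagreement,
# so an observer of the agreed values no longer sees how far apart
# the private estimates started.
print(pstdev(pre_agreement), pstdev(post_agreement))
```

Whether the average is all an outside observer needs is exactly what the surrounding comments dispute; the sketch only confirms that the two pairs share a mean.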
Others are more sidelined than supporting a particular side.
I would prefer you not treat people avoiding a discussion as evidence that people don’t differentially evaluate the assertions made in that discussion.
Doing so creates a perverse incentive whereby chiming in to say “me too!” starts to feel like a valuable service, which would likely chase me off the site altogether. (Similar concerns apply to upvoting comments I agree with but don’t want to see more of.)
If you are seriously interested in data about how many people believe or disbelieve certain propositions, there exist techniques for gathering that data that are more reliable than speculating.
If you aren’t interested, you could just not bring it up.
I would prefer you not treat people avoiding a discussion as evidence that people don’t differentially evaluate the assertions made in that discussion.
I treat them as not having given me evidence either way. I honestly don’t know how I could treat them otherwise.
Okay. It is not that they give no evidence by remaining out of the discussion—it is that the evidence they give is spread equally over all possibilities. I don’t know enough about these people to say that discussion-abstainers are uniformly in support or in opposition to the idea. The best I can do is assume they are equally distributed between support and opposition, and not incorrectly constrain my anticipations.
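As a rough sketch of what the even-split assumption does to a tally, take the count recalled earlier in the thread (four against, three for) and add abstainers split equally between the two sides. The function name `share_against` and the abstainer counts are hypothetical, chosen only to show the effect.

```python
def share_against(against: int, in_favour: int, abstainers: int) -> float:
    """Fraction opposed, assuming abstainers split evenly between the two sides."""
    total = against + in_favour + abstainers
    if total == 0:
        raise ValueError("need at least one person")
    return (against + abstainers / 2) / total


# Tally recalled in the thread: 4 against, 3 for.
for silent in (0, 10, 100):
    print(silent, round(share_against(4, 3, silent), 3))
```

Under this assumption the absolute margin between the sides never changes, but the proportion drifts toward one half as the number of silent readers grows, so the expressed opinions carry less and less of the total.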
the best I can do is assume they are equally distributed between support and opposition
You can do better than that along a number of different dimensions.
But even before getting there, it seems important to ask whether our unexpressed beliefs are relevant.
That is, if it turned out that instead of “equally distributed between support and opposition”, we are 70% on one side, or 90%, or 99%, or that there are third options with significant membership, would that information significantly affect your current confidence levels about what you believe?
If our unexpressed opinions aren’t relevant, you can just not talk about them at all, just like you don’t talk about millions of other things that you don’t know and don’t matter to you.
If they are relevant, one thing you could do is, y’know, research. That is, set up a poll clearly articulating the question and the answers that would affect your beliefs and let people vote for their preferred answers. That would be significantly better than assuming equal distribution.
Another thing you could do, if gathering data is unpalatable, is look at the differential characteristics of groups that express one opinion or another and try to estimate what percentage of the site shares which characteristics.
would that information significantly affect your current confidence levels about what you believe?
Yes. In the absence of actual evidence (which seems dangerous to gather in the case of this basilisk), I pretty much have to go by expressed opinions. To my mind, it was like trying to count the results of experiments that haven’t been performed yet.
I did not seek out more information because it was a throwaway line in an argument attempting to explain to people why it appears their voices are being ignored. I personally am on the side of censoring the idea, not having understood it at all when it was first posted, and that may have bled into my posts (I should have exercised stronger control over that) but I am not arguing for censorship. I am arguing why, when someone says “it’s not dangerous!”, some people aren’t coming around to their perspective.
I don’t intend to argue for the censorship of the idea unless sorely pressed.
** I’m confused. On the one hand, you say knowing the popularity of various positions is important to you in deciding your own beliefs about something potentially dangerous to you and others. On the other hand, you say it’s not worth seeking more information about and was just a throwaway line in an argument. I am having a hard time reconciling those two claims… you seem to be trying to have it both ways. I suspect I’ve misunderstood something important.
** I didn’t think you were arguing for censorship. Or against it. Actually, I have long since lost track of what most participants in this thread are arguing for, and in some cases I’m not sure they themselves know.
** I agree with you that the existence of knowledgeable people who think something is dangerous is evidence that it’s dangerous.
** Since it seems to matter: for my own part, I rate the expected dangerousness of “the basilisk” very low, and the social cost to the group of the dispute over “censoring” it significantly higher but still low.
** I cannot see why that should be of any evidentiary value whatsoever, to you or anyone else. Whether I’m right or wrong, my position is a pretty easy-to-reach one; it’s the one you arrive at in the absence of other salient beliefs (like, for example, the belief that EY/SIAI is a highly reliable estimator of potential harm done by “basilisks” in general, or the belief that the specific argument for the harmfulness of this basilisk is compelling). And most newcomers will lack those other beliefs. So I expect that quite a few people share my position—far more than 50% -- but I can’t see why you ought to find that fact compelling. That a belief is very widely shared among many many people like me who don’t know much about the topic isn’t much evidence for anything.
(nods) I’m a great believer in it. Especially in cases where a disagreement has picked up momentum, and recognizable factions have started forming… for example, if people start suggesting that those who side with the other team should leave the group. My confidence in my ability to evaluate an argument honestly goes up when I genuinely don’t know what team that argument is playing for.
I suspect I’ve obfuscated it, actually. The popularity of various positions is not intrinsically important to me—in fact, I give professions of belief about as little credit as I can get away with. This specific case is such that every form of evidence I find stronger (reasoning through the argument logically for flaws; statistical evidence about its danger) is not available. With a dearth of stronger evidence, I have to rely on weak evidence—but “the evidence is weak” is not an argument for privileging my own unsubstantiated position.
I don’t feel the need to collect weak evidence … I should, in this case. I was following a heuristic of not collecting weak evidence (waste of effort) without noticing that there was no stronger evidence.
Why are people’s beliefs of any value? Everyone has the ability to reason. All (non-perfect) reasoners fail in some way or another; if I look at many (controlling for biased reasoning) it gives me more of a chance to spot the biases—I have a control to compare it to.
This case is a special case; some people do have evidence. They’ve read the basilisk, applied their reasoning and logic, and deduced that it is / is not dangerous. These peoples’ beliefs are to be privileged over people who have not read the basilisk. I can’t access private signals like that—I don’t want to read a potential basilisk. So I make a guess at how strong their private signal is (this is why I care about their rationality) and use that as weak evidence for or against.
If seeking harder evidence wasn’t dangerous (and it usually isn’t) I would have done that instead.
The sentence I quoted sounded to me as though you were treating those of us who’ve remained “sidelined” as evidence of something. But if you were instead just bringing us up as an example of something that provides no evidence of anything, and if that was clear to everyone else, then I’m content.
Without taking a poll of anything except my memory, Eliezer+Roko+VladNesov+Alicorn are against, DavidGerard+waitingforgodel+vaniver are for.
I’m for. I believe Tim Tyler is for.
Aumann agreement works in the case of hidden information—all you need are posteriors and common knowledge of the event alone.
Humans have this unfortunate feature of not being logically omniscient. In such cases where people don’t see all the logical implications of an argument we can treat those implications as hidden information. If this weren’t the case then the censorship would be totally unnecessary as Roko’s argument didn’t actually include new information. We would have all turned to stone already.
Roko increased his estimation and Eliezer decreased his estimation—and the amounts they did so are balanced according to the strength of their private signals.
There is no way for you to have accurately assessed this. Roko and Eliezer aren’t idealized Bayesian agents, it is extremely unlikely they performed a perfect Aumann agreement. If one is more persuasive than the other for reasons other than the evidence they share, then their combined support for the proposition may not be worth the same as two people who independently came to support the proposition. Besides which, according to you, what information did they share exactly?
I had a private email conversation with Eliezer that did involve a process of logical discourse, and another with Carl.
Also, when I posted the material, I hadn’t thought it through. Once I had thought it through, I realized that I had accidentally said more than I should have done.
David_Gerard, Jack, timtyler, waitingforgodel, and Vaniver do not currently outweigh Eliezer_Yudkowsky, FormallyknownasRoko, Vladimir_Nesov, and Alicorn, as of now, in my mind.
It does not need to be a perfect Aumann agreement; a merely good one will still reduce the chances of overcounting or undercounting either side’s evidence well below the acceptable limits.
There is no way for you to have accurately assessed this. Roko and Eliezer aren’t idealized Bayesian agents, it is extremely unlikely they performed a perfect Aumann agreement.
They are approximations of Bayesian agents, and it is extremely likely they performed an approximate Aumann agreement.
To settle this particular question, however, I will pay money. I promise to donate 50 dollars to the Singularity Institute for Artificial Intelligence, independent of other plans to donate, if Eliezer confirms that he did revise his estimate down; or if he confirms that he did not revise his estimate down. Payable within two weeks of Eliezer’s comment.
I’m curious: if he confirms instead that the change in his estimate, if there was one, was small enough relative to his estimate that he can’t reliably detect it or detect its absence, although he infers that he updated using more or less the same reasoning you use above, will you donate or not?
I would donate even if he said that he revised his estimate upwards.
I would then seriously reconsider my evaluation of him, but as it stands the offer is for him to weigh in at all, not weigh in on my side.
edit: I misparsed your comment. That particular answer would dance very close to ‘no comment’, but unless it seemed constructed that way on purpose, I would still donate.
Yeah, that’s fair. One of the things I was curious about was, in fact, whether you would take that answer as a hedge, but “it depends” is a perfectly legitimate answer to that question.
For the posterior to be equal to or lower than the prior, Vaniver would have to be more of a rationalist than Eliezer, Roko, and you put together.
How many of me would there have to be for that to work?
Also, why is rationalism the risk factor for this basilisk? Maybe the basilisk only turns to stone people with brown eyes (or the appropriate mental analog).
How many of me would there have to be for that to work?
Only one; I meant ‘you’ in that line to refer to Vlad. It does raise the question “how many people disagree before I side with them instead of Eliezer/Roko/Vlad”. And the answer to that is … complicated. Each person’s rationality, modified by how much it was applied in this particular case, is the weight I give to their evidence; then the full calculation of evidence for and against should bring my prior to within epsilon but preferably below my original prior for me to decide the idea is safe.
Also, why is rationalism the risk factor for this basilisk?
Rationalism is the ability to think well and this is a dangerous idea. If it were a dangerous bacterium then immune system would be the risk factor.
Rationalism is the ability to think well and this is a dangerous idea. If it were a dangerous bacterium then immune system would be the risk factor.
Generally, if your immune system is fighting something, you’re already sick. Most pathogens are benign or don’t have the keys to your locks. This might be a similar situation- the idea is only troubling if your lock fits it- and it seems like then there would be rational methods to erode that fear (like the immune system mobs an infection).
The analogy definitely breaks down, doesn’t it? What I had in mind was Eliezer, Roko, and Vlad saying “I got sick from this infection” and you saying “I did not get sick from this infection”—I would look at how strong each person’s immune system is.
So if Eliezer, Roko, and Vlad all had weak immune systems and yours was quite robust, I would conclude that the bacterium in question is not particularly virulent. But if three robust immune systems all fell sick, and one robust immune system did not, I would be forced to decide between some hypotheses:
1. the first three are actually weak immune systems
2. the fourth was not properly exposed to the bacterium
3. the fourth has a condition that makes it immune
4. the bacterium is not virulent, the first three got unlucky
On the evidence I have, the middle two seem more likely than the first and last hypotheses.
I agree- my money is on #3 (but I’m not sure whether I would structure it as “fourth is immune” or “first three are vulnerable”- both are correct, but which is the more natural wording depends on the demographic response).
Sorry, I should not have included censoring specifically. Change the “read:”s to ‘engages, reacts negatively’, ‘engages, does not react negatively’ and the argument still functions.
Perhaps it would be more effective at improving human rationality to expose people to ideas like this with the sole purpose of overcoming that sort of terror?
You would need a mechanism for actually encouraging them to “overcome” the terror, rather than reinforce it. Otherwise you might find that your subjects are less rational after this process than they were before.
being terrified of very unlikely terrible events is a known human failure mode
one wonders how something like that might have evolved, doesn’t one? What happened to all the humans who came with the mutation that made them want to find out whether the sabre-toothed tiger was friendly?
one wonders how something like that might have evolved, doesn’t one? What happened to all the humans who came with the mutation that made them want to find out whether the sabre-toothed tiger was friendly?
I don’t see how very unlikely events that people knew the probability of would have been part of the evolutionary environment at all.
In fact, I would posit that the bias is most likely due to having a very high floor for probability. In the evolutionary environment things with probability you knew to be <1% would be unlikely to ever be brought to your attention. So not having any good method for intuitively handling probabilities between 1% and zero would be expected.
In fact, I don’t think I have an innate handle on probability to any finer grain than ~10% increments. Anything more than that seems to require mathematical thought.
But probably far more than 1% of cave-men who chose to seek out a sabre-tooth tiger to see if they were friendly died due to doing so.
The relevant question on an issue of personal safety isn’t “What % of the population die due to trying this?”
The relevant question is: “What % of the people who try this will die?”
In the first case, rollerskating downhill, while on fire, after having taken arsenic would seem safe (as I suspect no-one has ever done precisely that)
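To make the distinction between the two questions concrete, here is a minimal sketch in Python. Every figure in it is hypothetical and chosen only for illustration, and the function names are invented for this example. It compares the share of the whole population that dies from an activity with the share of the people who actually try it and die.

```python
from typing import Optional


def share_of_population(deaths: int, population: int) -> float:
    """Fraction of the whole population that died from the activity."""
    return deaths / population


def share_of_attempters(deaths: int, attempts: int) -> Optional[float]:
    """Fraction of those who tried the activity and died.

    Returns None when nobody has tried it, because there is no data at all.
    """
    if attempts == 0:
        return None
    return deaths / attempts


# Hypothetical figures, assumed purely for illustration.
population = 10_000
tiger_attempts, tiger_deaths = 50, 30   # approaching the tiger
stunt_attempts, stunt_deaths = 0, 0     # the never-attempted stunt

print(share_of_population(tiger_deaths, population))      # small: few people try
print(share_of_attempters(tiger_deaths, tiger_attempts))  # large: most who try die
print(share_of_population(stunt_deaths, population))      # zero: looks "safe"
print(share_of_attempters(stunt_deaths, stunt_attempts))  # None: no evidence either way
```

Under these assumed numbers the population-wide rate is tiny for both activities, while the rate among those who try is high for the tiger and simply undefined for the stunt, which is the sense in which the first question is the wrong one to ask.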
one wonders how something like that might have evolved, doesn’t one?
No, really, one doesn’t wonder. It’s pretty obvious. But if we’ve gotten to the point where “this bias paid off in the evolutionary environment!” is actually used as an argument, then we are off the rails of refining human rationality.
What’s wrong with using “this bias paid off in the evolutionary environment!” as an argument? I think people who paid more attention to this might make fewer mistakes, especially in domains where there isn’t a systematic, exploitable difference between EEA and now.
The evolutionary environment contained entities capable of dishing out severe punishments, uncertainty, etc.
If anything, I think that the heuristic that an idea “obviously” can’t be dangerous is the problem, not the heuristic that one should take care around possibilities of strong penalties.
It is a fine argument for explaining the widespread occurrence of fear. However, today humans are in an environment where their primitive paranoia is frequently triggered by inappropriate stimuli.
He says “we” are the healthiest and safest humans ever to live, but I’m very skeptical that this refers specifically to Americans rather than present day first world nation citizens in general.
Yes, we are, in fact, safer than in the EEA, in contemporary USA.
But still, there are some real places where danger is real, like the Bronx or Scientology or organized crime or walking across a freeway. So, don’t go rubbishing the heuristic of being frightened of potentially real danger.
I think it would only be legitimate to criticize fear itself on “outside view” grounds if we lived in a world with very little actual danger, which is not at all the case.
But still, there are some real places where danger is real, like the Bronx or Scientology or organized crime or walking across a freeway.
So, this may be a good way to approach the issue: loss to individual humans is, roughly speaking, finite. Thus, the correct approach to fear is to gauge risks by their chance of loss, and then discount if it’s not fatal.
So, we should be much less worried by a 1e-6 risk than a 1e-4 risk, and a 1e-4 risk than a 1e-2 risk. If you are more scared by a 1e-6 risk than a 1e-2 risk, you’re reasoning fallaciously.
Now, one might respond- “but wait! This 1e-6 risk is 1e5 times worse than the 1e-2 risk!”. But that seems to fall into the traps of visibility bias and privileging the hypothesis. If you’re considering a 1e-6 risk, have you worked out not just all the higher order risks, but also all of the lower order risks that might have higher order impact? And so when you have an idea like the one in question, which I would give a risk of 1e-20 for discussion’s sake, and you consider it without also bringing into your calculus essentially every other risk possible, you’re not doing it rigorously. And, of course, humans can’t do that computation.
Now, the kicker here is that we’re talking about fear. I might fear the loss of every person I know just as strongly as I fear the loss of every person that exists, but be willing to do more to prevent the loss of everyone that exists (because that loss is actually larger). Fear has psychological ramifications, not decision-theoretic ones. If this idea has 1e-20 chances of coming to pass, you can ignore it on a fear level, and if you aren’t, then I’m willing to consider that evidence you need help coping with fear.
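The objection quoted in that comment (“this 1e-6 risk is 1e5 times worse than the 1e-2 risk”) rests on a naive expected-loss calculation, which can be sketched as follows. This is only an illustration: the probabilities and the 1e5 multiplier are taken from the comment, while the unit loss of 1.0 and the function name are assumptions made for this example.

```python
def expected_loss(probability: float, magnitude: float) -> float:
    """Naive expected loss: chance of the event times the size of the loss."""
    return probability * magnitude


# Loss magnitudes are in arbitrary units; 1.0 is an assumed baseline.
common_risk = expected_loss(1e-2, 1.0)   # likelier, ordinary-sized loss
rare_risk = expected_loss(1e-6, 1e5)     # far less likely, 1e5 times worse

print(common_risk)
print(rare_risk)
print(rare_risk > common_risk)
```

On this calculation alone the rare risk comes out roughly ten times larger. The comment's point is that the calculation is incomplete: it singles out one low-probability hypothesis without also weighing every other risk of comparable or lower probability, so the number it produces should not be trusted as it stands.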
I have a healthy respect for the adaptive aspects of fear. However, we do need an explanation for the scale and prevalence of irrational paranoia.
The picture of an ancestral water hole surrounded by predators helps us to understand the origins of the phenomenon. The ancestral environment was a dangerous and nasty place where people led short, brutish lives. There, living in constant fear made sense.
He always held that panic was the best means of survival. Back in the old days, his theory went, people faced with hungry sabre-toothed tigers could be divided into those who panicked and those who stood there saying, “What a magnificent brute!” or “Here pussy”.
Considering the extraordinary appeal that forbidden knowledge has even for the average person, let alone the exceptionally intellectually curious, I don’t think this is a very effective way to warn a person off of seeking out the idea in question. Far from deserving what they get, such a person is behaving in a completely ordinary manner, to exceptionally severe consequence.
Personally, I don’t want to know about the idea (at least not if it’s impossible without causing myself significant psychological distress to no benefit,) but I’ve also put significant effort into training myself out of responses such as automatically clicking links to shock sites that say “Don’t click this link!”
I would add that I wish I had never learned about any of these ideas. In fact, I wish I had never come across the initial link on the internet that caused me to think about transhumanism and thereby about the singularity;
Hmm. It is tricky to go back, I would imagine.
The material does come with some warnings, I believe. For instance, consider this one:
In fact, I wish I had never come across the initial link on the internet that caused me to think about transhumanism and thereby about the singularity
As I understand, you donate (and plan to in the future) to existential risk charities, and that is one of the consequences of you having come across that link. How does this compute into net negative, in your estimation, or are you answering a different question?
Sure I want to donate. But if you express it as a hypothetical choice between being a person who didn’t know about any of this and had no way of finding out, versus what I have now, I choose the former. Though since that is not an available choice, it is a somewhat academic question.
But if you express it as a hypothetical choice between being a person who didn’t know about any of this and had no way of finding out, versus what I have now, I choose the former.
I can’t believe I’m hearing this from a person who wrote about Ugh fields. I can’t believe I’m reading a plea for ignorance on a blog devoted to refining rationality. Ignorance is bliss, is that the new motto now?
Well look, one has to do cost/benefit calculations, not just blindly surge forward in some kind of post-enlightenment fervor. To me, it seems like there is only one positive term in the equation: the altruistic value of giving money to some existential risk charity.
All the other terms are negative, at least for me. And unless I actually overcome excuses, akrasia, etc to donate a lot, I think it’ll all have been a mutually detrimental waste of time.
Not helping. I was referring to the moral value of donations as an argument for choosing to know, as opposed to not knowing. You don’t seem to address that in your reply (did I miss something?).
Oh, I see. Well, I guess it depends upon how much I eventually donate and how much of an incremental difference that makes.
It would certainly be better to just donate, AND to also not know anything about anything dangerous. I’m not even sure that’s possible, though. For all we know, just knowing about any of this is enough to land you in a lot of trouble either in the causal future or elsewhere.
The only possible reason I can see for why one wouldn’t want to spread it is that its negative potential does outweigh its very-very-low-probability (and that only if you accept a long chain of previous beliefs).
I gather doing so would irritate our site’s host and moderator.
E.g both me and Nesov have been persuaded (once fully filled in) that this is really nasty stuff and shouldn’t be let out.
I wasn’t “filled in”, and I don’t know whether my argument coincides with Eliezer’s. I also don’t understand why he won’t explain his argument, if it’s the same as mine, now that content is in the open (but it’s consistent with, that is responds to the same reasons as, continuing to remove comments pertaining to the topic of the post, which makes it less of a mystery).
As a decision on expected utility under logical uncertainty, but extremely low confidence, yes. I can argue that it most certainly won’t be a bad thing (which I even attempted in comments to the post itself, my bad), the expectation of it being a bad thing derives from remaining possibility of those arguments failing. As Carl said, “that estimate is unstable in the face of new info” (which refers to his own argument, not necessarily mine).
For everyone who wants to know what this discussion is all about, the forbidden idea, here is something that does not resemble it except for its stupefying conclusions:
There’s this guy who has the idea that it might be rational to rob banks to donate the money to charities. He tells a friend at a bank about it who freaks out and urges him to shut up about it. Unfortunately some people who also work at the local bank overheard the discussion and it gave them horrible nightmares. Since they think the idea makes sense they now believe that everyone will starve to death if they don’t rob banks and donate the money to charities that try to feed the world. The friend working at the bank now gets really upset and tells the dude with the idea about this. He argues that this shows how dangerous the idea is and that his colleagues and everyone else who’s told about the idea and who is working for a bank might just rob their own banks.
An inconvenient detail here, that makes the idea slightly less likely, is that it isn’t talking about the local bank in your town, or any bank on Earth at all, but one located in the system of Epsilon Eridani. And the friend and people with nightmares are not working for some bank but a charity concerned with space colonization. To conclude that the idea is dangerous, they don’t just have to accept its overall premise but also that there are banks over at Epsilon Eridani, that it is worth it to rob them, that one can build the necessary spaceships to reach that place and so on. In other words you have to be completely nuts and should seek help if you seriously believe that the idea is dangerous.
And what is the secondary problem?
The problem is not the idea itself but that there obviously are people crazy enough to take it seriously and who might commit to crazy things due to their beliefs. The problem is that everyone is concerned with space colonization but that we don’t want to have it colonized by some freaks with pirate spaceships to rob alien banks because of some crazy idea.
P.S. Any inconsistency in the above story is intended to resemble the real idea.
This analogy makes sense if you assume the conclusion that the argument for the post being a Basilisk is incorrect, but not as an argument for convincing people that it’s incorrect. To evaluate whether the argument is correct, you have to study the argument itself, there is no royal road (the conclusion can be studied in other ways, since particular proof can’t be demanded).
(See this summary of the structure of the argument.)
FWIW, loads of my comments were deleted by administrators at the time.
I was away for a couple of months while the incident took place and when I returned I actually used your user page to reconstruct most of the missing conversation (with blanks filled from other user pages and an alternate source). Yours was particularly useful because of how prolific you were with quoting those to whom you were replying. I still have ten pages of your user comments stored on my hard drive somewhere. :)
Yes, some weren’t fully deleted, but—IIRC—others were. If I am remembering this right, the first deleted post (Roko’s) left comments behind in people’s profiles, but with the second deleted post the associated comments were rendered completely inaccessible to everyone. At the time, I figured that the management was getting better at nuking people’s posts.
After that—rather curiously—some of my subsequent “marked deleted” posts remained visible to me when logged in—so I wasn’t even aware of what had been “marked deleted” to everyone else for most of the time—unless I logged out of the site.
Good idea. We should’ve started using this standard reference when the censorship complaints began, but at least henceforth.
Yes. THIS IS NOT CENSORSHIP. Just in case anyone missed it.
You are evidently confused about what the word means. The systematic deletion of any content that relates to an idea that the person with power does not wish to be spoken is censorship in the same way that threatening to (probabilistically) destroy humanity is terrorism. As in, blatantly obviously—it’s just what the words happen to mean.
Going around saying ‘this isn’t censorship’ while doing it would trigger all sorts of ‘crazy cult’ warning bells.
Yes, the acts in question can easily be denoted by the terms “blackmail” and “censorship.” And your final sentence is certainly true as well.
To avoid being called a cult, to avoid being a cult, and to avoid doing bad things generally, we should stop the definition debate and focus on whether people’s behavior has been appropriate. If connotation conundrums keep you quarreling about terms, pick variables (e.g. “what EY did”=E and “what WFG precommitted to doing, and in fact did”=G) and keep talking.
YES IT IS. In case anyone missed it. It isn’t Roko’s post we’re talking about right now
There is still a moral sense in which if, after careful thought, I decided that that material should not have been posted, then any posts which resulted solely from my post are in a sense a violation of my desire to not have posted it. Especially if said posts operate under the illusion that my original post was censored rather than retracted.
But in reality such ideas tend to propagate like the imp of the perverse: a gnawing desire to know what the “censored” material is, even if everyone who knows what it is has subsequently decided that they wished they didn’t! E.g both me and Nesov have been persuaded (once fully filled in) that this is really nasty stuff and shouldn’t be let out. (correct me if I am wrong).
This “imp of the perverse” property is actually part of the reason why the original post is harmful. In a sense, this is an idea-virus which makes people who don’t yet have it want to have it, but as soon as they have been exposed to it, they (belatedly) realize they really didn’t want to know about it or spread it.
Sigh.
The only people who seem to be filled in are you and Yudkowsky. I think Nesov just argues against it based on some very weak belief. As far as I can tell, I got all the material in question. The only possible reason I can see for why one wouldn’t want to spread it is that its negative potential does outweigh its very-very-low-probability (and that only if you accept a long chain of previous beliefs). It doesn’t. It also isn’t some genuine and brilliant idea that all this mystery mongering makes it seem to be. Everyone I sent it just laughed about it. But maybe you can fill me in?
If the idea is dangerous in the first place (which is very unlikely), it is only dangerous to people who understand it, because understanding it makes you vulnerable. The better you understand it and the more you think about it, the more vulnerable you become. In hindsight, I would prefer to never have read about the idea in question.
I don’t think this is a big issue, considering the tiny probability that the scenario will ever occur, but I am glad that discussing it continues to be discouraged and would appreciate it if people stopped needlessly resurrecting it over and over again.
This strikes me as tautological and/or confusing definitions. I’m happy to agree that the idea is dangerous to people who think it is dangerous, but I don’t think it’s dangerous and I think I understand it. To make an analogy, I understand the concept of hell but don’t think it’s dangerous, and so the concept of hell does not bother me. Does the fact that I do not have the born-again Christian’s fear of hell mean that they understand it and I don’t? I don’t see why it should.
I can’t figure out a way to explain this further without repropagating the idea, which I will not do. It is likely that there are one or more pieces of the idea which you are not familiar with or do not understand, and I envy your epistemological position.
Yes, but the concept of hell is easier to understand. From what I have read in the discussions, I have no idea how the Basilisk is supposed to work, while it’s quite easy to understand how hell is supposed to work.
Upvoted, agree strongly.
Upvoted, agree strongly.
If you people are this worried about reality, why don’t you work to support creating a Paperclip maximizer? It would have a lot of fun doing what it wants to do and everyone else would quickly die. Nobody ever after would have to fear what could possibly happen to them at some point.
If you people want to try to turn the universe into a better place, at whatever cost, then why do you worry or wish to not know about potential obstacles? Both are irrational.
The forbidden topic seems to be a dangerous Ugh field for a lot of people here. You have to decide what you want and then follow through on it. Any self-inflicted pain just adds to the overall negative.
You do not understand what you are talking about.
The basilisk idea has no positive value. All it does is cause those who understand it to bear a very low probability of suffering incredible disutility at some point in the future. Explaining this idea to someone does them about as much good as slashing their tires.
I understand that but do not see that the description applies to the idea in question, insofar as it is in my opinion no more probable than fiction and that any likelihood is being outweighed by opposing ideas. There are however other well-founded ideas, free speech and transparency, that are being ignored. I also believe that people would benefit from talking about it and possibly overcome and ignore it subsequently.
But I’m tired of discussing this topic and will do you the favor to shut up about it. But remember that I haven’t been the one who started this thread. It was Roko and whoever asked to delete Roko’s comment.
Look, you have three people all of whom think it is a bad idea to spread this. All are smart. Two initially thought it was OK to spread it.
Furthermore, I would add that I wish I had never learned about any of these ideas. In fact, I wish I had never come across the initial link on the internet that caused me to think about transhumanism and thereby about the singularity; I wish very strongly that my mind had never come across the tools to inflict such large amounts of potential self-harm with such small durations of inattention, uncautiousness and/or stupidity, even if it is all premultiplied by a small probability. (not a very small one, mind you. More like 1⁄500 type numbers here)
If this is not enough warning to make you stop wanting to know more, then you deserve what you get.
I wish you’d talk to someone other than Yudkowsky about this. You don’t need anyone to harm you, you already seem to harm yourself. You indulge yourself in self-inflicted psychological stress. As Seneca said, “there are more things that terrify us than there are that oppress us, and we suffer more often in opinion than in reality”. You worry and pay interest for debt that will likely never be made.
I have read about quite a few smart people who hold idiotic beliefs; I consider this to be only marginal evidence.
You’d rather be some ignorant pleasure maximizing device? For me truth is the most cherished good.
BS.
More so than not opening yourself up to a small risk of severe consequences? E.g. if you found a diary that clearly belonged to some organized crime boss, would you open it up and read it? I see this situation as analogous.
Really thought you were going to go with Tom Riddle on this one. Perfect line break for it :)
You are a truth seeker? Really? I think that makes you pretty rare and unusual!
There’s a lot of truth out there. Is there any pattern to which truths you are interested in?
Yes, I’d choose to eat from the tree of the knowledge of good and evil and tell God to fuck off.
So, as a gift: 63,174,774 + 6,761,374,774 = 6,824,549,548.
Or—if you don’t like that particular truth—care to say which truths you do like?
I can’t tell you, I cherry-pick what I want to know when it is hinted at. But generally most of all I want to know about truths that other agents don’t want me to know about.
There are thousands of truths I know that I don’t want you to know about. (Or, to be more precise, that I want you to not know about.) Are you really most interested in those, out of all the truths I know?
I think I’d be disturbed by that if I thought it were true.
I’m not sure that’s a very good heuristic. Are you sure that truly describes the truths you care most about? It seems analogous to the fact that people are more motivated by a cause if they learn that some people oppose it, which is silly.
Heh—OK. Thanks for the reply. Yes, that is not that bad a heuristic! Maybe someday you can figure this out in more detail. It is surely good to know what you want.
I love this reply. I don’t think it’s necessarily the best reply, and I don’t really even think it’s a polite reply, but it’s certainly one of the funniest ones I’ve seen here.
I see a lot more than three people here, most of whom are smart, and most of them think that Langford basilisks are fictional, and even if they aren’t, censoring them is the wrong thing to do. You can’t quarantine the internet, and so putting up warning signs makes more people fall into the pit.
I saw the original idea and the discussion around it, but I was (fortunately) under stress at the time and initially dismissed it as so implausible as to be unworthy of serious consideration. Given the reactions to it by Eliezer, Alicorn, and Roko, who seem very intelligent and know more about this topic than I do, I’m not so sure. I do know enough to say that, if the idea is something that should be taken seriously, it’s really serious. I can tell you that I am quite happy that the original posts are no longer present, because if they were I am moderately confident that I would want to go back and see if I could make more sense out of the matter, and if Eliezer, Alicorn, and Roko are right about this, making sense out of the matter would be seriously detrimental to my health.
Thankfully, either it’s a threat but I don’t understand it fully, in which case I’m safe, or it’s not a threat, in which case I’m also safe. But I am sufficiently concerned about the possibility that it’s a threat that I don’t understand fully but might be able to realize independently given enough thought that I’m consciously avoiding extended thought about this matter. I will respond to posts that directly relate to this one but am otherwise done with this topic—rest assured that, if you missed this one, you’re really quite all right for it!
This line of argument really bothers me. What does it mean for E, A, and R to seem very intelligent? As far as I can tell, the necessary conclusion is “I will believe a controversial statement of theirs without considering it.” When you word it like that, the standards are a lot higher than “seem very intelligent”, or at least narrower- you need to know their track record on decisions like this.
(The controversial statement is “you don’t want to know about X,” not X itself, by the way.)
I am willing to accept the idea that (intelligent) specialists in a field may know more about their field than nonspecialists and are therefore more qualified to evaluate matters related to their field than I.
Good point, though I would point out that you need E, A, and R to be specialists when it comes to how people react to X, not just X, and I would say there’s evidence that’s not true.
I agree, but I know what conclusion I would draw from the belief in question if I actually believed it, so the issue of their knowledge of how people react is largely immaterial to me in particular. I was mostly posting to provide a data point in favor of keeping the material off LW, not to attempt to dissolve the issue completely or anything.
You don’t need any specific kind of proof, you already have some state of knowledge about correctness of such statements. There is no “standard of evidence” for forming a state of knowledge, it just may be that without the evidence that meets that “standard” you don’t expect to reach some level of certainty, or some level of stability of your state of knowledge (i.e. low expectation of changing your mind).
Whatever man, go ahead and make your excuses, you have been warned.
I have not only been warned, but I have stared the basilisk in the eyes, and I’m still here typing about it. In fact, I have only cared enough to do so because it was banned, and I wanted the information on how dangerous it was to judge the wisdom of the censorship.
On a more general note, being terrified of very unlikely terrible events is a known human failure mode. Perhaps it would be more effective at improving human rationality to expose people to ideas like this with the sole purpose of overcoming that sort of terror?
I’ll just second that I also read it a while back (though after it was censored) and thought that it was quite interesting but wrong on multiple levels. Not ‘probably wrong’ but wrong like an invalid logic proof is wrong (though of course I am not 100% certain of anything). My main concern about the censorship is that not talking about what was wrong with the argument will allow the proliferation of the reasoning errors that left people thinking the conclusion was plausible. There is a kind of self-fulfilling prophecy involved in not recognizing these errors which is particularly worrying.
Consider this invalid proof that 1 = 2:
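The steps of the proof do not appear at this point in the thread. A standard version of the fallacy, offered here as an assumed reconstruction rather than the commenter's exact wording, is numbered so that the division lands on step (5), matching the refutation that follows:

```latex
\begin{align*}
x &= y && (1)\ \text{assumption} \\
x^2 &= xy && (2)\ \text{multiply both sides by } x \\
x^2 - y^2 &= xy - y^2 && (3)\ \text{subtract } y^2 \\
(x + y)(x - y) &= y(x - y) && (4)\ \text{factor both sides} \\
x + y &= y && (5)\ \text{divide by } (x - y) \\
2y &= y && (6)\ \text{substitute } x = y \\
2 &= 1 && (7)\ \text{divide by } y
\end{align*}
```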
You could refute this by pointing out that step (5) involved division by (x - y) = (y - y) = 0, and you can’t divide by 0.
But imagine if someone claimed that the proof is invalid because “you can’t represent numbers with letters like ‘x’ and ‘y’”. You would think that they don’t understand what is actually wrong with it, or why someone might mistakenly believe it. This is basically my reaction to everyone I have seen oppose the censorship because of some argument they present that the idea is wrong and no one would believe it.
I’m actually not sure if I understand your point. Either it is a round-about way of making it or I’m totally dense and the idea really is dangerous (or some third option).
It’s not that the idea is wrong and no one would believe it, it’s that the idea is wrong and when presented with the explanation for why it’s wrong no one should believe it. In addition, it’s kind of important that people understand why it’s wrong. I’m sympathetic to people with different minds that might have adverse reactions to things I don’t, but the solution to that is to warn them off, not censor the topics entirely.
Yes, the idea really is dangerous.
And for those who understand the idea, but not why it is wrong, nor the explanation of why it is wrong?
This is a politically reinforced heuristic that does not work for this problem.
Transparency is very important regarding people and organisations in powerful and unique positions. The way they act and what they claim in public is weak evidence in support of their honesty. To claim that they have to censor certain information in the name of the greater public good, and to fortify the decision based on their public reputation, provides no evidence about their true objectives. The only way to solve this issue is by means of transparency.
Transparency might certainly have negative consequences, but it need not, and its benefits can outweigh the risks of simply trusting that certain people are telling the truth and are not engaging in deception to pursue their true objectives.
There is also nothing that Yudkowsky has ever achieved that would sufficiently prove a superior intellect, which in turn would justify people simply believing him about some extraordinary claim.
When I say something is a misapplied politically reinforced heuristic, you only reinforce my point by making fully general political arguments that it is always right.
Censorship is not the most evil thing in the universe. The consequences of transparency are allowed to be worse than censorship. Deal with it.
I already had Anna Salamon telling me something about politics. You sound just as incomprehensible to me. Sorry, this is not meant as an attack.
I stated several times in the past that I am completely in favor of censorship, I have no idea why you are telling me this.
Our rules and intuitions about free speech and censorship are based on the types of censorship we usually see in practice. Ordinarily, if someone is trying to censor a piece of information, then that information falls into one of two categories: either it’s information that would weaken them politically, by making others less likely to support them and more likely to support their opponents, or it’s information that would enable people to do something that they don’t want done.
People often try to censor information that makes people less likely to support them, and more likely to support their opponents. For example, many governments try to censor embarrassing facts (“the Purple Party takes bribes and kicks puppies!”), the fact that opposition exists (“the Pink Party will stop the puppy-kicking!”) and its strength (“you can join the Pink Party, there are 10^4 of us already!”), and organization of opposition (“the Pink Party rally is tomorrow!”). This is most obvious with political parties, but it happens anywhere people feel like there are “sides”—with religions (censorship of “blasphemy”) and with public policies (censoring climate change studies, reports from the Iraq and Afghan wars). Allowing censorship in this category is bad because it enables corruption, and leaves less-worthy groups in charge.
The second common instance of censorship is encouragement and instructions for doing things that certain people don’t want done. Examples include cryptography, how to break DRM, pornography, and bomb-making recipes. Banning these is bad if the capability is suppressed for a bad reason (cryptography enables dissent), if it’s entangled with other things (general-purpose chemistry applies to explosives), or if it requires infrastructure that can also be used for the first type of censorship (porn filters have been caught blocking politicians’ campaign sites).
These two cases cover 99.99% of the things we call “censorship”, and within these two categories, censorship is definitely bad, and usually worth opposing. It is normally safe to assume that if something is being censored, it is for one of these two reasons. There are gray areas—slander (when the speaker knows he’s lying and has malicious intent), and bomb-making recipes (when they’re advertised as such and not general-purpose chemistry), for example—but the law has the exceptions mapped out pretty accurately. (Slander gets you sued, bomb-making recipes get you surveilled.) This makes a solid foundation for the principle that censorship should be opposed.
However, that principle and the analysis supporting it apply only to censorship that falls within these two domains. When things fall outside these categories, we usually don’t call them censorship; for example, there is a widespread conspiracy among email and web site administrators to suppress ads for Viagra, but we don’t call that censorship, even though it meets every aspect of the definition except motive. If you happen to find a weird instance of censorship which doesn’t fall into either category, then you have to start over and derive an answer to whether censorship in that particular case is good or bad, from scratch, without resorting to generalities about censorship-in-general. Some of the arguments may still apply—for example, building a censorship-technology infrastructure is bad even if it’s only meant to be used on spam—but not all of them, and not with the same force.
If the usual arguments against censorship don’t apply, and we’re trying to figure out whether to censor it, the next two things to test are whether it’s true, and whether an informed reader would want to see it. If both of these conditions hold, then it should not be censored. However, if either condition fails to hold, then it’s okay to censor.
Either the forbidden post is false, in which case it does not deserve protection because it’s false, or it’s true, in which case it should be censored because no informed person should want to see it. In either case, people spreading it are doing a bad thing.
Even if this is right the censorship extends to perhaps true conversations about why the post is false. Moreover, I don’t see what truth has to do with it. There are plenty of false claims made on this site that nonetheless should be public because understanding why they’re false and how someone might come to think that they are true are worthwhile endeavors.
The question here is rather straightforward: does the harm of the censorship outweigh the harm of letting people talk about the post? I can understand how you might initially think those who disagree with you are just responding to knee-jerk anti-censorship instincts that aren’t necessarily valid here. But from where I stand the arguments made by those who disagree with you do not fit this pattern. I think XiXi has been clear in the past about why the transparency concern does apply to SIAI. We’ve also seen arguments for why censorship in this particular case is a bad idea.
There are clearly more than two options here. There seem to be two points under contention:
It is/is not (1/2) reasonable to agree with the forbidden post.
It is/is not (3/4) desirable to know the contents of the forbidden post.
You seem to be restricting us to either 2+3 or 1+4. It seems that 1+3 is plausible (should we keep children from ever knowing about death because it’ll upset them?), and 2+4 seems like a good argument for restriction of knowledge (the idea is costly until you work through it, and the benefits gained from reaching the other side are lower than the costs).
But I personally suspect 2+3 is the best description, and that doesn’t explain why people trying to spread it are doing a bad thing. Should we delete posts on Pascal’s Wager because someone might believe it?
Excluded middle, of course: incorrect criterion. (Was this intended as a test?) It would not deserve protection if it were useless (like spam), not “if it were false.”
The reason I consider sufficient to keep it off LessWrong is that it actually hurt actual people. That’s pretty convincing to me. I wouldn’t expunge it from the Internet (though I might put a warning label on it), but from LW? Appropriate. Reposting it here? Rude.
Unfortunately, that’s also an argument as to why it needs serious thought applied to it, because if the results of decompartmentalised thinking can lead there, humans need to be able to handle them. As Vaniver pointed out, there are previous historical texts that have had similar effects. Rationalists need to be able to cope with such things, as they have learnt to cope with previous conceptual basilisks. So it’s legitimate LessWrong material at the same time as being inappropriate for here. Tricky one.
(To the ends of that “compartmentalisation” link, by the way, I’m interested in past examples of basilisks and other motifs of harmful sensation in idea form. Yes, I have the deleted Wikipedia article.)
Note that I personally found the idea itself silly at best.
The assertion that if a statement is not true, fails to alter political support, fails to provide instruction, and an informed reader wants to see that statement, it is therefore a bad thing to spread that statement and an OK thing to censor, is, um, far from uncontroversial.
To begin with, most fiction falls into this category. For that matter, so does most nonfiction, though at least in that case the authors generally don’t intend for it to be non-true.
No, you reversed a sign bit: it is okay to censor if an informed reader wouldn’t want to see it (and the rest of those conditions).
No, I don’t think so. You said “if either condition fails to hold, then it’s okay to censor.” If it isn’t true, and an informed reader wants to see it, then one of the two conditions failed to hold, and therefore it’s OK to censor.
No?
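The disagreement here turns on how the quoted rule reads as a boolean condition. A minimal sketch, assuming the rule is exactly "okay to censor unless the post is both true and wanted by an informed reader" (the function and parameter names are made up for illustration):

```python
from itertools import product


def okay_to_censor(is_true: bool, informed_reader_wants_it: bool) -> bool:
    """The rule as literally stated: protected only when BOTH conditions hold."""
    return not (is_true and informed_reader_wants_it)


# Enumerate all four combinations of the two conditions.
rows = [
    (is_true, wants, okay_to_censor(is_true, wants))
    for is_true, wants in product([True, False], repeat=2)
]

for is_true, wants, censor in rows:
    print(f"true={is_true!s:5}  reader_wants={wants!s:5}  okay_to_censor={censor}")
```

Under this literal reading, the combination "not true, but an informed reader wants to see it" comes out as okay to censor, which is the case most fiction falls into. An additional condition of the kind conceded in the reply would have to enter as a third argument.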
Oops, you’re right—one more condition is required. The condition I gave is only sufficient to show that it fails to fall into a protected class, not that it falls in the class of things that should be censored; there are things which fall in neither class (which aren’t normally censored because that requires someone with a motive to censor it, which usually puts it into one of the protected classes). To make it worthy of censorship, there must additionally be a reason outside the list of excluded reasons to censor it.
Your comment that I am replying to is often way more salient than things you have said in the past that I may or may not have observed.
I just have trouble understanding what you are saying. That might very well be my fault. I do not intend any hostile attack against you or the SIAI. I’m just curious, not worried at all. I do not demand anything. I’d like to learn more about you people, what you believe and how you arrived at your beliefs.
There is this particular case of the forbidden topic and I am throwing everything I’ve got at it to see if the beliefs about it are consistent and hold water. That doesn’t mean that I am against censorship or that I believe it is wrong. I believe it is right but too unlikely (...). I believe that Yudkowsky and the SIAI are probably honest (although my gut feeling is to be very skeptical) but that there are good arguments for more transparency regarding the SIAI (if you believe it is as important as being portrayed). I believe that Yudkowsky is wrong about his risk estimation regarding the idea.
I just don’t understand your criticism of my past comments and that included telling me something about how I use politics (I don’t get it) and that I should accept that censorship sometimes is necessary (which I haven’t argued against).
You are just going to piss off the management.
IMO, it isn’t that interesting.
Yudkowsky apparently agrees that squashing it was handled badly.
Anyway, now Roko is out of self-imposed exile, I figure it is about time to let it drop.
The problem with that is that Eliezer and those who agree with him, including me, cannot speak freely about our reasoning on the issue, because we don’t want to spread the idea, so we don’t want to describe it and point to details about it as we describe our reasoning. If you imagine yourself in our position, believing the idea is dangerous, you could tell that you wouldn’t want to spread the idea in the process of explaining its danger either.
Under more normal circumstances, where the ideas we disagree about are not thought by anyone to be dangerous, we can have effective discussion by laying out our true reasons for our beliefs, and considering counter arguments that refer to the details of our arguments. Being cut off from our normal effective methods of discussion is stressful, at least for me.
I have been trying to persuade people who don’t know the details of the idea or don’t agree that it is dangerous that we do in fact have good reasons for believing it to be dangerous, or at least that this is likely enough that they should let it go. This is a slow process, as I think of ways to express my thoughts without revealing details of the dangerous idea, or explaining them to people who know but don’t understand those details. And this ends up involving talking to people who, because they don’t think the idea is dangerous and don’t take it seriously, express themselves faster and less carefully, and who have conflicting goals like learning or spreading the idea, or opposing censorship in general, or having judged for themselves the merits of censorship (from others just like them) in this case. This is also stressful.
I engage in this stressful topic, because I think it is important, both that people do not get hurt from learning about this idea, and that SIAI/Eliezer do not get dragged through mud for doing the right thing.
Sorry, but I am not here to help you get the full understanding you need to judge if the beliefs are consistent and hold water. As I have been saying, this is not a normal discussion. And seriously, you would be better off dropping it and finding something else to worry about. And if you think it is important, you can remember to track if SIAI/Eliezer/supporters like me engage in a pattern of making excuses to ban certain topics to protect some hidden agenda. But then please remember all the critical discussions that don’t get banned.
Note that this shouldn’t be possible other than through arguments from authority.
(I’ve just now formed a better intuitive picture of the reasons for danger of the idea, and saw some of the comments previously made unnecessarily revealing, where the additional detail didn’t actually serve the purpose of convincing people I communicated with, who lacked some of the prerequisites for being able to use that detail to understand the argument for danger, but would potentially gain (better) understanding of the idea. It does still sound silly to me, but maybe the lack of inferential stability of this conclusion should actually be felt this way—I expect that the idea will stop being dangerous in the following decades due to better understanding of decision theory.)
Does this theory of yours require that Eliezer Yudkowsky plus several other old-time Less Wrongians are holding the Idiot Ball and being really stupid about something that you can just see as obvious?
Now might be a good time to notice that you are confused.
Something to keep in mind when you reply to comments here is that you are the default leader of this community and its highest status member. This means comments that would be reasonably glib or slightly snarky from other posters can come off as threatening and condescending when made by you. They’re not really threatening, but they can instill in their targets strong fight-or-flight responses. Perhaps this is because in the ancestral environment status challenges from group leaders were far more threatening to our ancestors’ livelihood than challenges from other group members. When you’re kicking out trolls it’s a sight to see, but when you’re rhetorically challenging honest interlocutors it’s probably counter-productive. I had to step away from the computer because I could tell that even if I was wrong the feelings this comment provoked weren’t going to let me admit it (and you weren’t even actually mean, just snobby).
As to your question, I don’t think my understanding of the idea requires anyone to be an idiot. In fact from what you’ve said I doubt we’re that far apart on the matter of how threatening the idea is. There may be implications I haven’t thought through that you have and there may be general responses to implications I’ve thought of that you haven’t. I often have trouble telling how much intelligence I needed to get somewhere but I think I’ve applied a fair amount in this case. Where I think we probably diverge significantly is in our estimation of the cost of the censorship, which I think is more than high enough to outweigh the risk of making Roko’s idea public. It is at least plausible that you are underestimating this cost due to biases resulting from your social position in this group and your organizational affiliation.
I’ll note that, as wedrifid suggested, your position also seems to assume that quite a few Less Wrongians are being really stupid and can’t see the obvious. Perhaps those who have expressed disagreement with your decision aren’t quite as old-time as those who have. And perhaps this is because we have not internalized important concepts or accessed important evidence required to see the danger in Roko’s idea. But it is also noteworthy that the people who have expressed disagreement have mostly been outside the Yudkowsky/SIAI cluster relative to those who have agreed with you. This suggests that they might be less susceptible to the biases that may be affecting your estimation of the cost of the censorship.
I am a bit confused, as I’m not totally sure the explanations I’ve thought of or seen posted for your actions sufficiently explain them, but that’s just the kind of uncertainty one always expects in disagreements. Are you not confused? If I didn’t think there was a downside to the censorship I would let it go. But I think the downside is huge; in particular I think the censorship makes it much harder to get more people beyond the SIAI circle to take Friendliness seriously as a scholarly field. I’m not sure you’re humble enough to care about that (that isn’t meant as a character attack btw). It makes the field look like a joke and makes its leading scholar look ridiculous. I’m not sure you have the political talents to recognize that. It also slightly increases the chances of someone not recognizing this failure mode (the one in Roko’s post) when it counts. I think you might be so sure (or so focused on the possibility that) you’re going to be the one flipping the switch in that situation that you aren’t worried enough about that.
Repeating “But I say so!” with increasing emphasis until it works. Been taking debating lessons from Robin?
It seems to me that the natural effect of a group leader persistently arguing from his own authority is Evaporative Cooling of Group Beliefs. This is of course conducive to confirmation bias and corresponding epistemological skewing for the leader; things which seem undesirable for somebody in Eliezer’s position. I really wish that Eliezer was receptive to taking this consideration seriously.
The thing is he usually does. That is one thing that has in the past set Eliezer apart from Robin and impressed me about Eliezer. Now it is almost as though he has embraced the evaporative cooling concept as an opportunity instead of a risk and gone and bought himself a blowtorch to force the issue!
Huh, so there was a change? Curious. Certainly looking over some of Eliezer’s past writings there are some that I identify with a great deal.
Far be it from me to be anything but an optimist. I’m going with ‘exceptions’. :)
Maybe, given the credibility he has accumulated on all these other topics, you should be willing to trust him on the one issue on which he is asserting this authority and on which it is clear that if he is right, it would be bad to discuss his reasoning.
The well known (and empirically verified) weakness in experts of the human variety is that they tend to be systematically overconfident when it comes to judgements that fall outside their area of exceptional performance—particularly when the topic is one just outside the fringes.
When it comes to blogging about theoretical issues of rationality Eliezer is undeniably brilliant. Yet his credibility specifically when it comes to responding to risks is rather less outstanding. In my observation he reacts emotionally and starts making rookie mistakes of rational thought and action. To the point where I’ve very nearly responded ‘Go read the sequences!’ before remembering that he was the flipping author and so should already know better.
Also important is the fact that elements of the decision are about people, not game theory. Eliezer hopefully doesn’t claim to be an expert when it comes to predicting or eliciting optimal reactions in others.
We were talking about his credibility in judging whether this idea is a risk, and that is within his area of expertise.
Was it not clear that I do not assign particular credence to Eliezer when it comes to judging risks? I thought I expressed that with considerable emphasis.
I’m aware that you disagree with my conclusions—and perhaps even my premises—but I can assure you that I’m speaking directly to the topic.
I do not consider this strong evidence as there are many highly intelligent and productive people who hold crazy beliefs:
Francisco J. Ayala, who “has been called the ‘Renaissance Man of Evolutionary Biology’”, is a geneticist ordained as a Dominican priest. His “discoveries have opened up new approaches to the prevention and treatment of diseases that affect hundreds of millions of individuals worldwide…”
Francis Collins (geneticist, Human Genome Project), noted for his landmark discoveries of disease genes and his leadership of the Human Genome Project (HGP) and described by the Endocrine Society as “one of the most accomplished scientists of our time”, is an evangelical Christian.
Peter Duesberg (a professor of molecular and cell biology at the University of California, Berkeley) claimed that AIDS is not caused by HIV, which made him so unpopular that his colleagues and others have — until recently — been ignoring his potentially breakthrough work on the causes of cancer.
Georges Lemaître (a Belgian Roman Catholic priest) proposed what became known as the Big Bang theory of the origin of the Universe.
Kurt Gödel (logician, mathematician and philosopher) who suffered from paranoia and believed in ghosts. “Gödel, by contrast, had a tendency toward paranoia. He believed in ghosts; he had a morbid dread of being poisoned by refrigerator gases; he refused to go out when certain distinguished mathematicians were in town, apparently out of concern that they might try to kill him.”
Mark Chu-Carroll (PhD Computer Scientist, works for Google as a Software Engineer) “If you’re religious like me, you might believe that there is some deity that created the Universe.” He is running one of my favorite blogs, Good Math, Bad Math, and writes a lot on debunking creationism and other crackpottery.
Nassim Taleb (the author of the 2007 book (completed 2010) The Black Swan) does believe: Can’t track reality with science and equations. Religion is not about belief. We were wiser before the Enlightenment, because we knew how to take knowledge from incomplete information, and now we live in a world of epistemic arrogance. Religious people have a way of dealing with ignorance, by saying “God knows”.
Kevin Kelly (editor) is a devout Christian. Writes pro science and technology essays.
William D. Phillips (Nobel Prize in Physics 1997) is a Methodist.
I could continue this list with people like Ted Kaczynski or Roger Penrose. I just wanted to show that intelligence and rational conduct do not rule out the possibility of being wrong about some belief.
Taleb quote doesn’t qualify. (I won’t comment on others.)
I should have made it clearer that it is not my intention to indicate that I believe that those people, or crazy ideas in general, are wrong. But there are a lot of smart people out there who’ll advocate opposing ideas. Using their reputation of being highly intelligent to follow through on their ideas is in my opinion not a very good idea in itself. I could just believe Freeman Dyson that existing simulation models of climate contain too much error to reliably predict future trends. I could believe Peter Duesberg that HIV does not cause AIDS; after all, he is a brilliant molecular biologist. But I just do not think that any amount of reputation is enough evidence to believe extraordinary claims uttered by such people. And in the case of Yudkowsky, there doesn’t even exist much reputation and no great achievements at all that would justify some strong belief in his infallibility. What does exist in Yudkowsky’s case seems to be strong emotional commitment. I just can’t tell if he is honest. If he really believes that he’s working on a policy for some future superhuman intelligence that will rule the universe, then I’m going to be very careful. Not because it is wrong, but because such beliefs imply huge payoffs. Not that I believe he is the disguised Dr. Evil, but can we be sure enough to just trust him with it? Censorship of certain ideas is more evidence against him than it is in favor of his honesty.
How extensively have you searched for experts who made correct predictions outside their fields of expertise? What would you expect to see if you just searched for experts making predictions outside their field of expertise and then determined if that prediction were correct? What if you limited your search to experts who had expressed the attitude Eliezer expressed in Outside the Laboratory?
“Rule out”? Seriously? What kind of evidence is it?
You extracted the “rule out” phrase from the sentence:
From within the common phrase ‘do not rule out the possibility’ no less!
You then make a reference to ‘0 and 1s not probabilities’ with exaggerated incredulity.
To put it mildly this struck me as logically rude and in general poor form. XiXiDu deserves more courtesy.
None of this affects my point that ruling out the possibility is the wrong, (in fact impossible), standard.
Not exaggerated. XiXiDu’s post did seem to be saying: here are these examples of experts being wrong so it is possible that an expert is wrong in this case, without saying anything useful about how probable it is for this particular expert to be wrong on this particular issue.
You have made an argument accusing me of logical rudeness that, quite frankly, does not stand up to scrutiny.
Better evidence than I’ve ever seen in support of the censored idea. I have these well-founded principles, free speech and transparency, and weigh them against the evidence I have in favor of censoring the idea. That evidence is merely 1.) Yudkowsky’s past achievements, 2.) his output and 3.) intelligence. That intelligent people have been and are wrong about certain ideas while still being productive and right about many other ideas is evidence to weaken #3. That people lie and deceive to get what they want is evidence against #1 and #2 and in favor of transparency and free speech, which are both already more likely to have a positive impact than the forbidden topic is to have a negative impact.
And what are you trying to tell me with this link? I haven’t seen anyone stating numeric probability estimations regarding the forbidden topic. And I won’t state one either, I’ll just say that it is subjectively improbable enough to ignore it because there are possibly too many very-very-low-probability events to take into account (for every being that will harm me if I don’t do X there is another being that will harm me if I do X, which cancel out each other). But if you’d like to pull some number out of thin air, go ahead. I won’t because I don’t have enough data to even calculate the probability of AI going FOOM versus a slow development.
You have failed to address my criticisms of your points, that you are seeking out only examples that support your desired conclusion, and that you are ignoring details that would allow you to construct a narrower, more relevant reference class for your outside view argument.
I was telling you that “ruling out the possibility” is the wrong, (in fact impossible), standard.
Only now I understand your criticism. I do not seek out examples to support my conclusion but to weaken your argument that one should trust Yudkowsky because of his previous output. I’m aware that Yudkowsky can very well be right about the idea but do in fact believe that the risk is worth taking. Have I done extensive research on how often people in similar situations have been wrong? Nope. No excuses here, but do you think there are comparable cases of predictions that proved to be reliable? And how much research have you done in this case and about the idea in general?
I don’t, I actually stated a few times that I do not think that the idea is wrong.
Seeking out just examples that weaken my argument, when I never predicted that no such examples would exist, is the problem I am talking about.
What made you think that supporting your conclusion and weakening my argument are different things?
My reason to weaken your argument is not that I want to be right but that I want feedback about my doubts. I said that 1.) people can be wrong, regardless of their previous reputation, 2.) that people can lie about their objectives and deceive by how they act in public (especially when the stakes are high), 3.) that Yudkowsky’s previous output and achievements are not remarkable enough to trust him about some extraordinary claim. You haven’t responded on why you tell people to believe Yudkowsky, in this case, regardless of my objections.
I’m sorry if I made it appear as if I hold some particular belief. My epistemic state simply doesn’t allow me to arrive at your conclusion. To highlight this I argued in favor of what it would mean to not accept your argument, namely to stand by previously well-established concepts like free speech and transparency. Yes, you could say that there is no difference here, except that I do not care about who is right but what is the right thing to do.
Still, it’s incorrect to argue from existence of examples. You have to argue from likelihood. You’d expect more correctness from a person with reputation for being right than from a person with reputation for being wrong.
People can also go crazy, regardless of their previous reputation, but it’s improbable, and not an adequate argument for their craziness.
And you need to know what fact you are trying to convince people about, not just search for soldier-arguments pointing in the preferred direction. If you believe that the fact is that a person is crazy, you too have to recognize that “people can be crazy” is an inadequate argument for this fact you wish to communicate, and that you shouldn’t name this argument in good faith.
(Craziness is introduced as a less-likely condition than wrongness to stress the structure of my argument, not to suggest that wrongness is as unlikely.)
I notice that Yudkowsky wasn’t always self-professed human-friendly. Consider this:
http://hanson.gmu.edu/vc.html#yudkowsky
Wow. That is scary. Do you have an estimated date on that bizarre declaration? Pre 2004 I assume?
He’s changed his mind since. That makes it far, far less scary.
(Parenthetical about how changing your mind, admitting you were wrong, oops, etc, is a good thing).
(Hence reference to Eliezer2004 sequence.)
He has changed his mind about one technical point in meta-ethics. He now realizes that super-human intelligence does not automatically lead to super-human morality. He is now (IMHO) less wrong. But he retains a host of other (mis)conceptions about meta-ethics which make his intentions abhorrent to people with different (mis)conceptions. And he retains the arrogance that would make him dangerous to those he disagrees with, if he were powerful.
“… far, far less scary”? You are engaging in wishful thinking no less foolish than that for which Eliezer has now repented.
I’m not at all sure that I agree with Eliezer about most meta-ethics, and definitely disagree on some fairly important issues. But, that doesn’t make his views necessarily abhorrent. If Eliezer triggers a positive Singularity (positive in the sense that it reflects what he wants out of a Singularity, complete with CEV), I suspect that that will be a universe which I won’t mind living in. People can disagree about very basic issues and still not hate each others’ intentions. They can even disagree about long-term goals and not hate it if the other person’s goals are implemented.
Have you ever had one of those arguments with your SO in which:
It is conceded that your intentions were good.
It is conceded that the results seem good.
The SO is still pissed because of the lack of consultation and/or presence of extrapolation?
I usually escape those confrontations by promising to consult and/or not extrapolate the next time. In your scenario, Eliezer won’t have that option.
When people point out that Eliezer’s math is broken because his undiscounted future utilities lead to unbounded utility, his response is something like “Find better math—discounted utility is morally wrong”.
When Eliezer suggests that there is no path to a positive singularity which allows for prior consultation with the bulk of mankind, my response is something like “Look harder. Find a path that allows people to feel that they have given their informed consent to both the project and the timetable—anything else is morally wrong.”
ETA: In fact, I would like to see it as a constraint on the meaning of the word “Friendly” that it must not only provide friendly consequences, but also, it must be brought into existence in a friendly way. I suspect that this is one of those problems in which the added constraint actually makes the solution easier to find.
Could you link to where Eliezer says that future utilities should not be discounted? I find that surprising, since uncertainty causes an effect roughly equivalent to discounting.
I would also like to point out that achieving public consensus about whether to launch an AI would take months or years, and that during that time, not only is there a high risk of unfriendly AIs, it is also guaranteed that millions of people will die. Making people feel like they were involved in the decision is emphatically not worth the cost.
He makes the case in this posting. It is a pretty good posting, by the way, in which he also points out some kinds of discounting which he believes are justified. This posting does not purport to be a knock-down argument against discounting future utility—it merely states Eliezer’s reasons for remaining unconvinced that you should discount (and hence for remaining in disagreement with most economic thinkers).
ETA: One economic thinker who disagrees with Eliezer is Robin Hanson. His response to Eliezer’s posting is also well worth reading.
Examples of Eliezer conducting utilitarian reasoning about the future without discounting are legion.
Tim Tyler makes the same assertion about the effects of uncertainty. He backs the assertion with metaphor, but I have yet to see a worked example of the math. Can you provide one?
Of course, one obvious related phenomenon—it is even mentioned with respect in Eliezer’s posting—is that the value of a promise must be discounted with time due to the increasing risk of non-performance: my promise to scratch your back tomorrow is more valuable to you than my promise to scratch next week—simply because there is a risk that you or I will die in the interim, rendering the promise worthless. But I don’t see how other forms of increased uncertainty about the future should have the same (exponential decay) response curve.
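For what it is worth, the promise case above can be checked numerically. The following is a minimal sketch, assuming a constant 5% annual chance of non-performance; that figure and all function names are illustrative and are not taken from Eliezer’s posting. It compares the expected total from a stream of 1 util per year, averaged over possible lifetimes, with the same stream received with certainty but discounted by a factor of 0.95 per year.

```python
def expected_total_over_lifetimes(survival, horizon=2000):
    """Expected utils from 1 util per year alive, averaged over lifetime lengths.

    Assumes the agent is alive in year 0 and survives each further year
    with probability `survival`; a lifetime of n years yields n utils.
    """
    total = 0.0
    for n in range(1, horizon + 1):
        p_lifetime_n = survival ** (n - 1) * (1.0 - survival)
        total += p_lifetime_n * n
    return total


def discounted_total(discount, horizon=2000):
    """Value of a certain stream of 1 util per year under exponential discounting."""
    return sum(discount ** t for t in range(horizon))


risky = expected_total_over_lifetimes(0.95)
certain_but_discounted = discounted_total(0.95)
print(risky, certain_but_discounted)  # both come out close to 20
```

The two totals agree only because the hazard rate is held constant. A risk that changes over time would give a survival curve of a different shape, which is exactly the open question raised above.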
So, start now.
Most tree-pruning heuristics naturally cause an effect like temporal discounting. Resource limits mean that you can’t calculate the whole future tree—so you have to prune. Pruning normally means applying some kind of evaluation function early—to decide which branches to prune. The more you evaluate early, the more you are effectively valuing the near-present.
That is not maths—but hopefully it has a bit more detail than previously.
It doesn’t really address the question. In the A* algorithm the heuristic estimates of the objective function are supposed to be upper bounds on utility, not lower bounds. Furthermore, they are supposed to actually estimate the result of the complete computation—not to represent a partial computation exactly.
Reality check: a tree of possible futures is pruned at points before the future is completely calculated. Of course it would be nice to apply an evaluation function which represents the results of considering all possible future branches from that point on. However, getting one of those that produces results in a reasonable time would be a major miracle.
If you look at things like chess algorithms, they do some things to get a more accurate utility valuation when pruning—such as check for quiescence. However, they basically just employ a standard evaluation at that point—or sometimes a faster, cheaper approximation. If it is sufficiently bad, the tree gets pruned.
We are living in the same reality. But the heuristic evaluation function still needs to be an estimate of the complete computation, rather than being something else entirely. If you want to estimate your own accumulation of pleasure over a lifetime, you cannot get an estimate of that by simply calculating the accumulation of pleasure over a shorter period—otherwise no one would undertake the pain of schooling motivated by the anticipated pleasure of high future income.
The question which divides us is whether an extra 10 utils now is better or worse than an additional 11 utils 20 years from now. You claim that it is worse. Period. I claim that it may well be better, depending on the discount rate.
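As a rough worked figure, assuming simple exponential discounting with a constant annual factor $\delta$ (a symbol introduced here purely for illustration), the break-even point between the two options is:

```latex
10 = 11\,\delta^{20}
\quad\Longrightarrow\quad
\delta = \left(\tfrac{10}{11}\right)^{1/20} \approx 0.9952,
\qquad
r = \tfrac{1}{\delta} - 1 \approx 0.48\%\ \text{per year}.
```

Under that assumption, a discount rate above roughly 0.48% per year favors the 10 utils now, and a lower rate favors the 11 utils in 20 years.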
Correct me if I’m missing an important nuance, but isn’t this just about whether one’s utils are timeless?
I’m not sure I understand the question. What does it mean for a util to be ‘timeless’?
ETA: The question of the interaction of utility and time is a confusing one. In “Against Discount Rates”, Eliezer writes:
I think that Eliezer has expressed the issue in almost, but not quite, the right way. The right question is whether a decision maker in 2007 should be 5% more interested in doing something about the 2008 issue than about the 2009 issue. I believe that she should be. If only because she expects that she will have an entire year in the future to worry about the 2009 family without the need to even consider 2008 again. 2008’s water will be already under the bridge.
I’m sure someone else can explain this better than me, but: As I understand it, a util understood timelessly (rather than like money, which there are valid reasons to discount because it can be invested, lost, revalued, etc. over time) builds into how it’s counted all preferences, including preferences that interact with time. If you get 10 utils, you get 10 utils, full stop. These aren’t delivered to your door in a plain brown wrapper such that you can put them in an interest-bearing account. They’re improvements in the four-dimensional state of the entire universe over all time, that you value at 10 utils. If you get 11 utils, you get 11 utils, and it doesn’t really matter when you get them. Sure, if you get them 20 years from now, then they don’t cover specific events over the next 20 years that could stand improvement. But it’s still worth eleven utils, not ten. If you value things that happen in the next 20 years more highly than things that happen later, then utils according to your utility function will reflect that, that’s all.
That (timeless utils) is a perfectly sensible convention about what utility ought to mean. But, having adopted that convention, we are left with (at least) two questions:
Do I (in 2011) derive a few percent more utility from an African family having clean water in 2012 than I do from an equivalent family having clean water in 2013?
If I do derive more utility from the first alternative, am I making a moral error in having a utility function that acts that way?
I would answer yes to the first question. As I understand it, Eliezer would answer yes to the second question and would answer no to the first, were he in my shoes. I would claim that Eliezer is making a moral error in both judgments.
Do you (in the years 2011, 2012, 2013, 2014) derive different relative utilities for these conditions? If so, it seems you have a problem.
I’m sorry. I don’t know what is meant by utility derived in 2014 from an event in 2012. I understand that the whole point of my assigning utilities in 2014 is to guide myself in making decisions in 2014. But no decision I make in 2014 can have an effect on events in 2012. So, from a decision-theoretic viewpoint, it doesn’t matter how I evaluate the utilities of past events. They are additive constants (same in all decision branches) in any computation of utility, and hence are irrelevant.
Or did you mean to ask about different relative utilities in the years before 2012? Yes, I understand that if I don’t use exponential discounting, then I risk inconsistencies.
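The inconsistency risk mentioned above can be illustrated with a small sketch. It assumes a choice between 10 utils at some date and 11 utils one year later, an exponential factor of 0.9 per year, and a hyperbolic weight of 1 / (1 + k * delay) with k = 1. All of these numbers and names are hypothetical, chosen only to make the effect visible.

```python
def exponential_weight(delay, discount=0.9):
    """Weight given to a reward `delay` years away under exponential discounting."""
    return discount ** delay


def hyperbolic_weight(delay, k=1.0):
    """Weight under a simple hyperbolic curve, used here as a non-exponential example."""
    return 1.0 / (1.0 + k * delay)


def prefers_earlier(weight, years_until_first):
    """True if 10 utils at the first date beat 11 utils one year after it."""
    earlier = 10.0 * weight(years_until_first)
    later = 11.0 * weight(years_until_first + 1)
    return earlier > later


# Same pair of rewards, judged when imminent (0 years away) and from 10 years out.
for name, weight in [("exponential", exponential_weight),
                     ("hyperbolic", hyperbolic_weight)]:
    print(name, prefers_earlier(weight, 0), prefers_earlier(weight, 10))
```

With the exponential weight the ranking is the same from both vantage points. With the hyperbolic weight the agent prefers the later reward when both are far away and then switches to the earlier one as it approaches, which is the kind of inconsistency in question.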
And that is a fact about the 2007 decision maker, not about the 2008 family’s value as compared to the 2009 family’s.
If, in 2007, you present me with a choice of clean water for a family for all of and only 2008 vs 2009, and you further assure me that these families will otherwise survive in hardship, and that their suffering in one year won’t materially affect their next year, and that I won’t have this opportunity again come this time next year, and that flow-on or snowball effects which benefit from an early start are not a factor here—then I would be indifferent to the choice.
If I would not be; if there is something intrinsic about earlier times that makes them more valuable, and not just a heuristic of preferring them for snowballing or flow-on reasons, then that is what Eliezer is saying seems wrong.
I would classify that as instrumental discounting. I don’t think anyone would argue with that—except maybe a superintelligence who has already exhausted the whole game tree—and for whom an extra year buys nothing.
Given that you also believe that distributing your charitable giving over many charities is ‘risk management’, I suppose that should not surprise me.
FWIW, I genuinely don’t understand your perspective. The extent to which you discount the future depends on your chances of enjoying it—but also on factors like your ability to predict it—and your ability to influence it—the latter are functions of your abilities, of what you are trying to predict and of the current circumstances.
You really, really do not normally want to put those sorts of things into an agent’s utility function. You really, really do want to calculate them dynamically, depending on the agent’s current circumstances, prediction ability levels, actuator power levels, previous experience, etc.
Attempts to put that sort of thing into the utility function would normally tend to produce an inflexible agent, who has more difficulties in adapting and improving. Trying to incorporate all the dynamic learning needed to deal with the issue into the utility function might be possible in principle—but that represents a really bad idea.
Hopefully you can see my reasoning on this issue. I can’t see your reasoning, though. I can barely even imagine what it might possibly be.
Maybe you are thinking that all events have roughly the same level of unpredictability in the future, and there is roughly the same level of difficulty in influencing them, so the whole issue can be dealt with by one (or a small number of) temporal discounting “fudge factors”—and that evolution built us that way because it was too stupid to do any better.
You apparently denied that resource limitation results in temporal discounting. Maybe that is the problem (if so, see my other reply here). However, now you seem to have acknowledged that an extra year of time to worry in helps with developing plans. What I can see doesn’t seem to make very much sense.
I really, really am not advocating that we put instrumental considerations into our utility functions. The reason you think I am advocating this is that you have this fixed idea that the only justification for discounting is instrumental. So every time I offer a heuristic analogy explaining the motivation for fundamental discounting, you interpret it as a flawed argument for using discounting as a heuristic for instrumental reasons.
Since it appears that this will go on forever, and I don’t discount the future enough to make the sum of this projected infinite stream of disutility seem small, I really ought to give up. But somehow, my residual uncertainty about the future makes me think that you may eventually take Cromwell’s advice.
To clarify: I do not think the only justification for discounting is instrumental. My position is more like: agents can have whatever utility functions they like (including ones with temporal discounting) without having to justify them to anyone.
However, I do think there are some problems associated with temporal discounting. Temporal discounting sacrifices the future for the sake of the present. Sometimes the future can look after itself—but sacrificing the future is also something which can be taken too far.
Axelrod suggested that when the shadow of the future grows too short, more defections happen. If people don’t sufficiently value the future, reciprocal altruism breaks down. Things get especially bad when politicians fail to value the future. We should strive to arrange things so that the future doesn’t get discounted too much.
Instrumental temporal discounting doesn’t belong in ultimate utility functions. So, we should figure out what temporal discounting is instrumental and exclude it.
If we are building a potentially-immortal machine intelligence with a low chance of dying and which doesn’t age, those are more causes of temporal discounting which could be discarded as well.
What does that leave? Not very much, IMO. The machine will still have some finite chance of being hit by a large celestial body for a while. It might die—but its chances of dying vary over time; its degree of temporal discounting should vary in response—once again, you don’t wire this in, you let the agent figure it out dynamically.
The point is that resource limitation makes these estimates bad estimates—and you can’t do better by replacing them with better estimates because of … resource limitation!
To see how resource limitation leads to temporal discounting, consider computer chess. Powerful computers play reasonable games—but heavily resource limited ones fall for sacrifice plays, and fail to make successful sacrifice gambits. They often behave as though they are valuing short-term gain over long term results.
A peek under the hood quickly reveals why. They only bother looking at a tiny section of the game tree near to the current position! More powerful programs can afford to exhaustively search that space—and then move on to positions further out. Also the limited programs employ “cheap” evaluation functions that fail to fully compensate for their short-term foresight—since they must be able to be executed rapidly. The result is short-sighted chess programs.
That resource limitation leads to temporal discounting is a fairly simple and general principle which applies to all kinds of agents.
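The chess claim above can be sketched with a toy search. This is not a chess engine: it is a single-agent tree invented for illustration, with a hypothetical “grab” line that pays 1 immediately and a “sacrifice” line that costs 1 now and pays 5 three moves in. The search applies no discount factor at all; it simply stops at a depth limit and counts only the rewards it has seen.

```python
# Each node is (immediate_reward, {move_name: child_node}).
TREE = (0, {
    "grab": (1, {"wait": (0, {"wait": (0, {})})}),
    "sacrifice": (-1, {"develop": (0, {"cash_in": (5, {})})}),
})


def search(node, depth):
    """Best total reward reachable from `node`, looking `depth` further moves ahead.

    At the depth limit the node's own reward is used as a cheap evaluation;
    anything beyond the horizon is simply not seen.
    """
    reward, children = node
    if depth == 0 or not children:
        return reward
    return reward + max(search(child, depth - 1) for child in children.values())


def best_move(root, depth):
    """Pick the root move whose subtree looks best within `depth` moves (depth >= 1)."""
    _, children = root
    return max(children, key=lambda move: search(children[move], depth - 1))


for depth in (1, 2, 3):
    print(depth, best_move(TREE, depth))
```

At depths 1 and 2 the search picks “grab”, and at depth 3 it picks “sacrifice”. Whether that behavior deserves the name “temporal discounting” is the point under dispute in this thread; the sketch only shows the behavior itself.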
Why do you keep trying to argue against discounting using an example where discounting is inappropriate by definition? The objective in chess is to win. It doesn’t matter whether you win in 5 moves or 50 moves. There is no discounting. Looking at this example tells us nothing about whether we should discount future increments of utility in creating a utility function.
Instead, you need to look at questions like this: An agent plays go in a coffee shop. He has the choice of playing slowly, in which case the games each take an hour and he wins 70% of them. Or, he can play quickly, in which case the games each take 20 minutes, but he only wins 60% of them. As soon as one game finishes, another begins. The agent plans to keep playing go forever. He gains 1 util each time he wins and loses 1 util each time he loses.
The main decision he faces is whether he maximizes utility by playing slowly or quickly. Of course, he has infinite expected utility however he plays. You can redefine the objective to be maximizing utility flow per hour and still get a ‘rational’ solution. But this trick isn’t enough for the following extended problem:
The local professional offers go lessons. Lessons require a week of time away from the coffee-shop and a 50 util payment. But each week of lessons turns 1% of your losses into victories. Now the question is: Is it worth it to take lessons? How many weeks of lessons are optimal? The difficulty here is that we need to compare the values of a one-shot (50 utils plus a week not playing go) with the value of an eternal continuous flow (the extra fraction of games per hour which are victories rather than losses). But that is an infinite utility payoff from the lessons, and only a finite cost, right? Obviously, the right decision is to take a week of lessons. And then another week after that. And so on. Forever.
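To make the lessons problem concrete, here is a minimal sketch with a per-week discount factor as a parameter. Several things are assumptions not stated in the problem: the agent plays quickly (3 games per hour, 40% losses) around the clock, 168 hours per week; each week of lessons multiplies the loss fraction by 0.99; and the fee is paid at the start of each lesson week. The function names are hypothetical.

```python
def weekly_flow(loss_fraction, games_per_hour=3.0, hours_per_week=168.0):
    """Net utils per week of play: +1 per win, -1 per loss."""
    return games_per_hour * hours_per_week * (1.0 - 2.0 * loss_fraction)


def plan_value(weeks_of_lessons, discount, loss_fraction=0.4, lesson_fee=50.0):
    """Discounted value of taking lessons first and then playing forever.

    Requires discount < 1; with discount == 1 the perpetual flow is infinite,
    which is the paradox described above.
    """
    value = 0.0
    weight = 1.0
    for _ in range(weeks_of_lessons):
        value -= weight * lesson_fee   # fee paid, and no play this week
        weight *= discount
        loss_fraction *= 0.99          # 1% of remaining losses become wins
    # Geometric series for the perpetual weekly flow that starts after lessons.
    value += weight * weekly_flow(loss_fraction) / (1.0 - discount)
    return value


def best_weeks(discount, max_weeks=2000):
    """Number of lesson weeks (0..max_weeks) with the highest discounted value."""
    return max(range(max_weeks + 1), key=lambda n: plan_value(n, discount))


print(weekly_flow(0.3, 1.0, 1.0), weekly_flow(0.4, 3.0, 1.0))  # about 0.4 and 0.6 utils per hour
print(best_weeks(0.99), best_weeks(0.5))
```

Under these assumptions quick play yields more utils per hour than slow play, and any discount factor below 1 gives a finite best number of lesson weeks. A patient agent takes some lessons and then stops, while a very impatient one takes none.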
Discounting of future utility flows is the standard and obvious way of avoiding this kind of problem and paradox. But now let us see whether we can alter this example to capture your ‘instrumental discounting due to an uncertain future’:
First, the obvious one. Our hero expects to die someday, but doesn’t know when. He estimates a 5% chance of death every year. If he is lucky, he could live for another century. Or he could keel over tomorrow. And when he dies, the flow of utility from playing go ceases. It is very well known that this kind of uncertainty about the future is mathematically equivalent to discounted utility in a certain future. But you seemed to be suggesting something more like the following:
Our hero is no longer certain what his winning percentage will be in the future. He knows that he experiences microstrokes roughly every 6 months, and that each incident takes 5% of his wins and changes them to losses. On the other hand, he also knows that roughly every year he experiences a conceptual breakthrough. And that each such breakthrough takes 10% of his losses and turns them into victories.
Does this kind of uncertainty about the future justify discounting on ‘instrumental grounds’? My intuition says “No, not in this case, but there are similar cases in which discounting would work.” I haven’t actually done the math, though, so I remain open to instruction.
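A rough version of that math can be sketched under a strong simplifying assumption: the events arrive on a fixed schedule (two microstrokes and then one breakthrough each year) rather than at random, and the starting win fraction is 0.6. The schedule, the starting value, and the function name are all assumptions made for illustration.

```python
def win_fraction_after(years, start=0.6):
    """Win fraction after `years` of a fixed yearly schedule of events."""
    w = start
    for _ in range(years):
        w *= 0.95                # first microstroke: 5% of wins become losses
        w *= 0.95                # second microstroke
        w += 0.10 * (1.0 - w)    # breakthrough: 10% of losses become wins
    return w


long_run = win_fraction_after(200)
print(long_run, 2.0 * long_run - 1.0)  # win fraction and net utils per game
```

Under this schedule each year maps w to 0.81225 * w + 0.1, so the win fraction settles near 0.1 / 0.18775, about 0.53, and the net flow per game settles at a small positive constant instead of decaying toward zero. On these assumptions the undiscounted total still diverges, which is consistent with the intuition that this kind of uncertainty does not behave like discounting.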
Temporal discounting is about valuing something happening today more than the same thing happening tomorrow.
Chess computers do, in fact, discount. That is why they do prefer to mate you in twenty moves rather than a hundred.
The values of a chess computer do not just tell it to win. In fact, they are complex—e.g. Deep Blue had an evaluation function that was split into 8,000 parts.
Operation consists of maximising the utility function, after foresight and tree pruning. Events that take place in branches after tree pruning has truncated them typically don’t get valued at all—since they are not foreseen. Resource-limited chess computers can find themselves preferring to promote a pawn sooner rather than later. They do so since they fail to see the benefit of sequences leading to promotion later.
So: we apparently agree that resource limitation leads to indifference towards the future (due to not bothering to predict it) - but I classify this as a kind of temporal discounting (since rewards in the future get ignored), whereas you apparently don’t.
Hmm. It seems as though this has turned out to be a rather esoteric technical question about exactly which set of phenomena the term “temporal discounting” can be used to refer to.
Earlier we were talking about whether agents focussed their attention on tomorrow—rather than next year. Putting aside the issue of whether that is classified as being “temporal discounting”—or not—I think the extent to which agents focus on the near-future is partly a consequence of resource limitation. Give the agents greater abilities and more resources and they become more future-oriented.
No, I have not agreed to that. I disagree with almost every part of it.
In particular, I think that the question of whether (and how much) one cares about the future is completely prior to questions about deciding how to act so as to maximize the things one cares about. In fact, I thought you were emphatically making exactly this point on another branch.
But that is fundamental ‘indifference’ (which I thought we had agreed cannot flow from instrumental considerations). I suppose you must be talking about some kind of instrumental or ‘derived’ indifference. But I still disagree. One does not derive indifference from not bothering to predict—one instead derives not bothering to predict from being indifferent.
Furthermore, I don’t respond to expected computronium shortages by truncating my computations. Instead, I switch to an algorithm which produces less accurate computations at lower computronium costs.
And finally, regarding classification, you seem to suggest that you view truncation of the future as just one form of discounting, whereas I choose not to. And that this makes our disagreement a quibble over semantics. To which I can only reply: Please go away Tim.
I think you would reduce how far you look forward if you were interested in using your resources intelligently and efficiently.
If you only have a million cycles per second, you can’t realistically go 150 ply deep into your go game—no matter how much you care about the results after 150 moves. You compromise—limiting both depth and breadth. The reduction in depth inevitably means that you don’t look so far into the future.
A lot of our communication difficulty arises from using different models to guide our intuitions. You keep imagining game-tree evaluation in a game with perfect information (like chess or go). Yes, I understand your point that in this kind of problem, resource shortages are the only cause of uncertainty—that given infinite resources, there is no uncertainty.
I keep imagining problems in which probability is built in, like the coffee-shop-go-player which I sketched recently. In the basic problem, there is no difficulty in computing expected utilities deeper into the future—you solve analytically and then plug in whatever value for t that you want. Even in the more difficult case (with the microstrokes) you can probably come up with an analytic solution. My models just don’t have the property that uncertainty about the future arises from difficulty of computation.
Right. The real world surely contains problems of both sorts. If you have a problem which is dominated by chaos based on quantum events then more resources won’t help. Whereas with many other types of problems more resources do help.
I recognise the existence of problems where more resources don’t help—I figure you probably recognise that there are problems where more resources do help—e.g. the ones we want intelligent machines to help us with.
Perhaps the real world does. But decision theory doesn’t. The conventional assumption is that a rational agent is logically omniscient. And generalizing decision theory by relaxing that assumption looks like it will be a very difficult problem.
The most charitable interpretation I can make of your argument here is that human agents, being resource limited, imagine that they discount the future. That discounting is a heuristic introduced by evolution to compensate for those resource limitations. I also charitably assume that you are under the misapprehension that if I only understood the argument, I would agree with it. Because if you really realized that I have already heard you, you would stop repeating yourself.
That you will begin listening to my claim that not all discounting is instrumental is more than I can hope for, since you seem to think that my claim is refuted each time you provide an example of what you imagine to be a kind of discounting that can be interpreted as instrumental.
I repeat, Tim. Please go elsewhere.
I am pretty sure that I just told you that I do not think that all discounting is instrumental. Here’s what I said:
Agents can have many kinds of utility function! That is partly a consequence of there being so many different ways for agents to go wrong.
Thx for the correction. It appears I need to strengthen my claim.
Not all discounting by rational, moral agents is instrumental.
Are we back in disagreement now? :)
No, we aren’t. In my book:
Being rational isn’t about your values, you can rationally pursue practically any goal. Epistemic rationality is a bit different—but I mostly ignore that as being unbiological.
Being moral isn’t really much of a constraint at all. Morality—and right and wrong—are normally with respect to a moral system—and unless a moral system is clearly specified, you can often argue all day about what is moral and what isn’t. Maybe some types of morality are more common than others—due to being favoured by the universe, or something like that—but any such context would need to be made plain in the discussion.
So, it seems (relatively) easy to make a temporal discounting agent that really values the present over the future—just stick a term for that in its ultimate values.
Are there any animals with ultimate temporal discounting? That is tricky, but it isn’t difficult to imagine natural selection hacking together animals that way. So: probably, yes.
Do I use ultimate temporal discounting? Not noticeably—as far as I can tell. I care about the present more than the future, but my temporal discounting all looks instrumental to me. I don’t go in much for thinking about saving distant galaxies, though! I hope that further clarifies.
I should probably review around about now. Instead of that: IIRC, you want to wire temporal discounting into machines, so their preferences better match your own—whereas I tend to think that would be giving them your own nasty hangover.
If you are not valuing my responses, I recommend you stop replying to them—thereby ending the discussion.
Programs make good models. If you can program it, you have a model of it. We can actually program agents that make resource-limited decisions. Having an actual program that makes decisions is a pretty good way of modeling making resource-limited decisions.
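As a minimal sketch of that claim, here is a toy agent whose only resource limit is how many steps ahead it can search. The world, its state names, and the rewards are all hypothetical, invented for illustration; nothing in the thread specifies them. The agent’s values contain no discount term at all, yet a shallow search makes it act as if it preferred the present.

```python
# Hypothetical toy world: each state maps actions to (reward, next_state).
WORLD = {
    "start": {"safe": (1.0, "plateau"), "risky": (0.0, "ridge")},
    "plateau": {"stay": (1.0, "plateau")},
    "ridge": {"climb": (5.0, "summit")},
    "summit": {"rest": (0.0, "summit")},
}


def lookahead_value(state, depth):
    """Best undiscounted total reward reachable from `state` in `depth` steps."""
    if depth == 0:
        return 0.0  # the search budget is exhausted; later rewards are invisible
    return max(
        reward + lookahead_value(next_state, depth - 1)
        for reward, next_state in WORLD[state].values()
    )


def choose(state, depth):
    """Pick the action with the best value under a search budget of `depth` steps."""
    def action_value(action):
        reward, next_state = WORLD[state][action]
        return reward + lookahead_value(next_state, depth - 1)

    return max(WORLD[state], key=action_value)


if __name__ == "__main__":
    for depth in (1, 2, 3):
        print(depth, choose("start", depth))
```

With a budget of one step the agent takes the immediate reward ("safe"); with two or more steps it accepts a delay for the larger payoff ("risky"). The apparent short-sightedness comes entirely from the search limit, not from the agent’s values.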
Perhaps we have some kind of underlying disagreement about what it means for temporal discounting to be “instrumental”.
In your example of an agent suffering from risk of death, my thinking is: this player might opt for a safer life—with reduced risk. Or they might choose to lead a more interesting but more risky life. Their degree of discounting may well adjust itself accordingly—and if so, I would take that as evidence that their discounting was not really part of their pure preferences, but rather was an instrumental and dynamic response to the observed risk of dying.
If—on the other hand—they adjusted the risk level of their lifestyle, and their level of temporal discounting remained unchanged, that would be confirming evidence in favour of the hypothesis that their temporal discounting was an innate part of their ultimate preferences—and not instrumental.
This bothers me since, with reasonable assumptions, all rational agents engage in the same amount of catastrophe discounting.
That is, observed discount rate = instrumental discount rate + chance of death + other factors
We should expect everyone’s discount rate to change, by the same amount, unless they’re irrational.
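A rough sketch of why the rates add, under assumptions the comment does not state: continuous-time exponential discounting and a constant chance of death per year (a hazard rate). The function names and the split into a "terminal" rate and a "hazard" rate are illustrative choices only. Multiplying the weight the agent places on a future payoff by the probability of surviving to collect it gives an observed rate equal to the sum of the two component rates.

```python
import math


def survival_probability(years, hazard_rate):
    """Chance of still being alive after `years`, assuming a constant hazard."""
    return math.exp(-hazard_rate * years)


def terminal_weight(years, terminal_rate):
    """Weight the agent's own values place on a payoff `years` away."""
    return math.exp(-terminal_rate * years)


def present_value(payoff, years, terminal_rate, hazard_rate):
    """Value today of a payoff that is only collected if the agent survives."""
    return (
        payoff
        * terminal_weight(years, terminal_rate)
        * survival_probability(years, hazard_rate)
    )


def observed_rate(years, terminal_rate, hazard_rate):
    """Annual discount rate an outside observer would infer from behaviour."""
    return -math.log(present_value(1.0, years, terminal_rate, hazard_rate)) / years


if __name__ == "__main__":
    # Illustrative numbers only: same values, different risk of dying.
    print(observed_rate(10, terminal_rate=0.01, hazard_rate=0.001))
    print(observed_rate(10, terminal_rate=0.01, hazard_rate=0.05))
```

In this model, changing the hazard by some amount changes the observed rate by exactly that amount, whatever the other component is.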
Agents do not all face the same risks, though.
Sure, they may discount the same amount if they do face the same risks, but often they don’t—e.g. compare the motorcycle racer with the nun.
So: the discounting rate is not fixed at so-much per year, but rather is a function of the agent’s observed state and capabilities.
Of course. My point is that observing if the discount rate changes with the risk tells you if the agent is rational or irrational, not if the discount rate is all instrumental or partially terminal.
Stepping back for a moment, terminal values represent what the agent really wants, and instrumental values are things sought en-route.
The idea I was trying to express was: if what an agent really wants is not temporally discounted, then instrumental temporal discounting will produce a predictable temporal discounting curve—caused by aging, mortality risk, uncertainty, etc.
Deviations from that curve would indicate the presence of terminal temporal discounting.
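One way to make "deviations from that curve" concrete is sketched below. It assumes, purely for illustration, that the instrumental curve is exponential in a known hazard rate and that any terminal discounting is also exponential; the function names and the least-squares fit are not taken from the thread. The residual rate is whatever discounting is left over after dividing out the predicted instrumental curve.

```python
import math


def instrumental_curve(years, hazard_rate):
    """Discount factor predicted from mortality risk alone (assumed exponential)."""
    return math.exp(-hazard_rate * years)


def residual_rate(observations, hazard_rate):
    """Estimate the discount rate not explained by the instrumental curve.

    `observations` is a list of (years, observed_discount_factor) pairs with
    years > 0. The estimate is the least-squares slope, through the origin,
    of the excess log-discount against the time horizon.
    """
    numerator = 0.0
    denominator = 0.0
    for years, factor in observations:
        excess = -math.log(factor / instrumental_curve(years, hazard_rate))
        numerator += years * excess
        denominator += years * years
    return numerator / denominator


if __name__ == "__main__":
    # Synthetic data: 2% per year hazard plus 1% per year of extra discounting.
    observed = [(t, math.exp(-0.03 * t)) for t in (1, 5, 10)]
    print(residual_rate(observed, hazard_rate=0.02))
```

A residual near zero is consistent with purely instrumental discounting; a clearly positive residual is the kind of deviation the comment describes. Real behavioural data would be noisy, and the estimate is only as good as the assumed instrumental curve.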
Agreed.
I have no disagreement at all with your analysis here. This is not fundamental discounting. And if you have decision alternatives which affect the chances of dying, then it doesn’t even work to model it as if it were fundamental.
You recently mentioned the possibility of dying in the interim. There’s also the possibility of aging in the interim. Such factors can affect utility calculations.
For example: I would much rather have my grandmother’s inheritance now than years down the line, when she finally falls over one last time—because I am younger and fitter now.
Significant temporal discounting makes sense sometimes—for example, if there is a substantial chance of extinction per unit time. I do think a lot of discounting is instrumental, though—rather than being a reflection of ultimate values—due to things like the future being expensive to predict and hard to influence.
My brain spends more time thinking about tomorrow than about this time next year—because I am more confident about what is going on tomorrow, and am better placed to influence it by developing cached actions, etc. Next year will be important too—but there will be a day before to allow me to prepare for it closer to the time, when I am better placed to do so. The difference is not because I will be older then—or because I might die in the mean time. It is due to instrumental factors.
Of course one reason this is of interest is because we want to know what values to program into a superintelligence. That superintelligence will probably not age—and will stand a relatively low chance of extinction per unit time. I figure its ultimate utility function should have very little temporal discounting.
The problem with wiring discount functions into the agent’s ultimate utility function is that that is what you want it to preserve as it self improves. Much discounting is actually due to resource limitation issues. It makes sense for such discounting to be dynamically reduced as more resources become cheaply available. It doesn’t make much sense to wire-in short-sightedness.
I don’t mind tree-pruning algorithms attempting to normalise partial evaluations at different times—so they are more directly comparable to each other. The process should not get too expensive, though—the point of tree pruning is that it is an economy measure.
I suspect you want to replace “feel like they have given” with “give.”
Unless you are actually claiming that what is immoral is to make people fail to feel consulted, rather than to fail to consult them, which doesn’t sound like what you’re saying.
I think I will go with a simple tense change: “feel that they are giving”. Assent is far more important in the lead-up to the Singularity than during the aftermath.
Although I used the language “morally wrong”, my reason for that was mostly to make the rhetorical construction parallel. My preference for an open, inclusive process is a strong preference, but it is really more political/practical than moral/idealistic. One ought to allow the horses to approach the trough of political participation, if only to avoid being trampled, but one is not morally required to teach them how to drink.
Ah, I see. Sure, if you don’t mean morally wrong but rather politically impractical, then I withdraw my suggestion… I entirely misunderstood your point.
No, I did originally say (and mostly mean) “morally” rather than “politically”. And I should thank you for inducing me to climb down from that high horse.
I submit that I have many of the same misconceptions that Eliezer does; he changed his mind about one of the few places I disagree with him. That makes it far more of a change than it would be for you (one out of eight is a small portion, one out of a thousand is an invisible fraction).
Good point. And since ‘scary’ is very much a subjective judgment, that means that I can’t validly criticize you for being foolish unless I have some way of arguing that yours and Eliezer’s positions in the realm of meta-ethics are misconceptions—something I don’t claim to be able to do.
So, if I wish my criticisms to be objective, I need to modify them. Eliezer’s expressed positions on meta-ethics (particularly his apparent acceptance of act-utilitarianism and his unwillingness to discount future utilities) together with some of his beliefs regarding the future (particularly his belief in the likelihood of a positive singularity and expansion of human population into the universe) make his ethical judgments completely unpredictable to many other people—unpredictable because the judgment may turn on subtle differences in the expected consequences of present day actions on people in the distant future. And, if one considers the moral judgments of another person to be unpredictable, and that person is powerful, then one ought to consider that person scary. Eliezer is probably scary to many people.
True, but it has little bearing on whether Eliezer should be scary. That is, “Eliezer is scary to many people” is mostly a fact about many people, and mostly not a fact about Eliezer. The reverse of this (and what I base this distinction on) is that some politicians should be scary, and are not scary to many people.
I’m not sure the proposed modification helps: you seem to have expanded your criticisms so far, in order to have them lead to the judgment you want to reach, that they cover too much.
I mean, sure, unpredictability is scarier (for a given level of power) than predictability. Agreed. But so what?
For example, my judgments will always be more unpredictable to people much stupider than I am than to people about as smart or smarter than I am. So the smarter I am, the scarier I am (again, given fixed power)… or, rather, the more people I am scary to… as long as I’m not actively devoting effort to alleviating those fears by, for example, publicly conforming to current fashions of thought. Agreed.
But what follows from that? That I should be less smart? That I should conform more? That I actually represent a danger to more people? I can’t see why I should believe any of those things.
You started out talking about what makes one dangerous; you have ended up talking about what makes people scared of one whether one is dangerous or not. They aren’t equivalent.
Well, I hope I haven’t done that.
Well, I certainly did that. I was trying to address the question more objectively, but it seems I failed. Let me try again from a more subjective, personal position.
If you and I share the same consequentialist values, but I know that you are more intelligent, I may well consider you unpredictable, but I won’t consider you dangerous. I will be confident that your judgments, in pursuit of our shared values, will be at least as good as my own. Your actions may surprise me, but I will usually be pleasantly surprised.
If you and I are of the same intelligence, but we have different consequentialist values (both being egoists, with disjoint egos, for example) then we can expect to disagree on many actions. Expecting the disagreement, we can defend ourselves, or even bargain our way to a Nash bargaining solution in which (to the extent that we can enforce our bargain) we can predict each other’s behavior to be that promoting compromise consequences.
If, in addition to different values, we also have different beliefs, then bargaining is still possible, though we cannot expect to reach a Pareto optimal bargain. But the more our beliefs diverge, regarding consequences that concern us, the less good our bargains can be. In the limit, when the things that matter to us are particularly difficult to predict, and when we each have no idea what the other agent is predicting, bargaining simply becomes ineffective.
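For readers unfamiliar with the term, the Nash bargaining solution picks the feasible outcome that maximizes the product of each party’s gain over what they would get if no bargain were struck. The sketch below applies that rule to a small finite list of outcomes; the utility numbers are made up, and a finite list is a simplification, since the standard formulation assumes a convex set of feasible outcomes.

```python
def nash_bargain(outcomes, disagreement):
    """Return the outcome maximizing the product of gains over disagreement.

    `outcomes` is a list of (utility_a, utility_b) pairs; `disagreement` is the
    pair both agents receive if bargaining fails. Outcomes that leave either
    agent worse off than the disagreement point are excluded.
    """
    d_a, d_b = disagreement
    acceptable = [(a, b) for a, b in outcomes if a >= d_a and b >= d_b]
    if not acceptable:
        return disagreement  # no deal beats walking away
    return max(acceptable, key=lambda o: (o[0] - d_a) * (o[1] - d_b))


if __name__ == "__main__":
    # Hypothetical utilities for two egoists choosing among four joint plans.
    plans = [(10, 0), (7, 4), (5, 5), (0, 10)]
    print(nash_bargain(plans, disagreement=(1, 1)))
```

Both agents must evaluate the same list of outcomes for this to work. When their beliefs about consequences diverge, each would fill in that list differently, which is the difficulty the preceding paragraph points to.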
Eliezer has expressed his acceptance of the moral significance of the utility functions of people in the far distant future. Since he believes that those people outnumber us folk in the present, that seems to suggest that he would be willing to sacrifice the current utility of us in favor of the future utility of them. (For example, the positive value of saving a starving child today does not outweigh the negative consequences on the multitudes of the future of delaying the Singularity by one day).
I, on the other hand, systematically discount the future. That, by itself, does not make Eliezer dangerous to me. We could strike a Nash bargain, after all. However, we inevitably also have different beliefs about consequences, and the divergence between our beliefs becomes greater the farther into the future we look. And consequences in the distant future are essentially all that matters to people like Eliezer—the present fades into insignificance by contrast. But, to people like me, the present and near future are essentially all that matter—the distant future discounts into insignificance.
So, Eliezer and I care about different things. Eliezer has some ability to predict my actions because he knows I care about short-term consequences and he knows something about how I predict short-term consequences. But I have little ability to predict Eliezer’s actions, because I know he cares primarily about long term consequences, and they are inherently much more unpredictable. I really have very little justification for modeling Eliezer (and any other act utilitarian who refuses to discount the future) as a rational agent.
I wish you would just pretend that they care about things a million times further into the future than you do.
The reason is that there are instrumental reasons to discount—the future disappears into a fog of uncertainty—and you can’t make decisions based on the value of things you can’t foresee.
The instrumental reasons fairly quickly dominate as you look further out—even when you don’t discount in your values. Reading your post, it seems as though you don’t “get” this, or don’t agree with it—or something.
Yes, the far-future is unpredictable—but in decision theory, that tends to make it a uniform grey—not an unpredictable black and white strobing pattern.
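The "uniform grey" point can be illustrated with a deliberately crude model, which is an assumption of this sketch rather than anything stated in the thread: suppose the chance of correctly predicting which of two actions leads to the better outcome decays geometrically toward a coin flip as the horizon lengthens. The agent values the far outcome at full weight, but the expected advantage of acting on its prediction still shrinks toward zero.

```python
def prediction_reliability(years, decay_per_year):
    """Chance of correctly predicting the better action; 0.5 is a coin flip."""
    return 0.5 + 0.5 * decay_per_year ** years


def expected_advantage(stake, years, decay_per_year):
    """Expected gain from acting on the prediction, with no value discounting.

    A correct prediction gains `stake`; a wrong one loses `stake`.
    """
    p = prediction_reliability(years, decay_per_year)
    return p * stake - (1 - p) * stake


if __name__ == "__main__":
    # Illustrative numbers: same stake, increasingly distant consequences.
    for years in (0, 10, 50, 100):
        print(years, expected_advantage(100.0, years, decay_per_year=0.9))
```

The stake never changes in this model, yet distant consequences contribute almost nothing to the choice between actions. Expected values converge rather than swinging between extremes, and the result looks like discounting even though no discount term appears in the agent’s values.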
I don’t need to pretend. Modulo some mathematical details, it is the simple truth. And I don’t think there is anything irrational about having such preferences. It is just that, since I cannot tell whether or not what I do will make such people happy, I have no motive to pay any attention to their preferences.
Yet, it seems that the people who care about the future do not agree with you on that. Bostrom, Yudkowsky, Nesov, et al. frequently invoke assessments of far-future consequences (sometimes in distant galaxies) in justifying their recommendations.
We have crossed wires here. What I meant is that I wish you would stop protesting about infinite utilities—and how non-discounters are not really even rational agents—and just model them as ordinary agents who discount a lot less than you do.
Objections about infinity strike me as irrelevant and uninteresting.
Is that your true objection? I expect you can figure out what would make these people happy easily enough most of the time—e.g. by asking them.
Indeed. That is partly poetry, though (big numbers make things seem important) - and partly because they think that the far future will be highly contingent on near future events.
The thing they are actually interested in influencing is mostly only a decade or so out. It does seem quite important—significant enough to reach back to us here anyway.
If what you are trying to understand is far enough away to be difficult to predict, and very important, then that might cause some oscillations. That is hardly a common situation, though.
Most of the time, organisms act as though they want to become ancestors. To do that, the best thing they can do is focus on having some grandkids. Expanding their circle of care out a few generations usually makes precious little difference to their actions. The far future is unforeseen, and usually can’t be directly influenced. It is usually not too relevant. Usually, you leave it to your kids to deal with.
That is a valid point. So, I am justified in treating them as rational agents to the extent that I can engage in trade with them. I just can’t enter into a long-term Nash bargain with them in which we jointly pledge to maximize some linear combination of our two utility functions in an unsupervised fashion. They can’t trust me to do what they want, and I can’t trust them to judge their own utility as bounded.
I think this is back to the point about infinities. The one I wish you would stop bringing up—and instead treat these folk as though they are discounting only a teeny, tiny bit.
Frankly, I generally find it hard to take these utilitarian types seriously in the first place. A “signalling” theory (holier-than-thou) explains the unusually high prevalence of utilitarianism among moral philosophers—and an “exploitation” theory explains its prevalence among those running charitable causes (utilitarianism-says-give-us-your-money). Those explanations do a good job of modelling the facts about utilitarianism—and are normally a lot more credible than the supplied justifications—IMHO.
Which suggests that we are failing to communicate. I am not surprised.
I do that! And I still discover that their utility functions are dominated by huge positive and negative utilities in the distant future, while mine are dominated by modest positive and negative utilities in the near future. They are still wrong even if they fudge it so that their math works.
I went from your “I can’t trust them to judge their own utility as bounded” to your earlier “infinity” point. Possibly I am not trying very hard here, though...
My main issue was you apparently thinking that you couldn’t predict their desires in order to find mutually beneficial trades. I’m not really sure if this business about not being able to agree to maximise some shared function is a big deal for you.
Mm. OK, so you are talking about scaring sufficiently intelligent rationalists, not scaring the general public. Fair enough.
What you say makes sense as far as it goes, assuming some mechanism for reliable judgments about people’s actual bases for their decisions. (For example, believing their self-reports.)
But it seems the question that should concern you is not whether Eliezer bases his decisions on predictable things, but rather whether Eliezer’s decisions are themselves predictable.
Put a different way: by your own account, the actual long-term consequences don’t correlate reliably with Eliezer’s expectations about them… that’s what it means for those consequences to be inherently unpredictable. And his decisions are based on his expectations, of course, not on the actual future consequences. So it seems to follow that once you know Eliezer’s beliefs about the future, whether those beliefs are right or wrong is irrelevant to you: that just affects what actually happens in the future, which you systematically discount anyway.
So if Eliezer is consistent in his beliefs about the future, and his decisions are consistently grounded in those beliefs, I’m not sure what makes him any less predictable to me than you are.
Of course, his expectations might not be consistent. Or they might be consistent but beyond your ability to predict. Or his decisions might be more arbitrary than you suggest here. For that matter, he might be lying outright. I’m not saying you should necessarily trust him, or anyone else.
But those same concerns apply to everybody, whatever their professed value structure. I would say the same things about myself.
But Eliezer’s beliefs about the future continue to change—as he gains new information and completes new deductions. And there is no way that he can practically keep me informed of his beliefs—neither he nor I would be willing to invest the time required for that communication. But Eliezer’s beliefs about the future impact his actions in the present, and those actions have consequences both in the near and distant future. From my point of view, therefore, his actions have essentially random effects on the only thing that matters to me—the near future.
Absolutely. But who isn’t that true of? At least Eliezer has extensively documented his putative beliefs at various points in time, which gives you some data points to extrapolate from.
I have no complaints regarding the amount of information about Eliezer’s beliefs that I have access to. My complaint is that Eliezer, and his fellow non-discounting act utilitarians, are morally driven by the huge differences in utility which they see as arising from events in the distant future—events which I consider morally irrelevant because I discount the future. No realistic amount of information about beliefs can alleviate this problem. The only fix is for them to start discounting. (I would have added “or for me to stop discounting” except that I still don’t know how to handle the infinities.)
Given that they predominantly care about things I don’t care about, and that I predominantly care about things they don’t worry about, we can only consider each other to be moral monsters.
You and I seem to be talking past each other now. It may be time to shut this conversation down.
Ethical egoists are surely used to this situation, though. The world is full of people who care about extremely different things from one another.
Yes. And if they both mostly care about modest-sized predictable things, then they can do some rational bargaining. Trouble arises when one or more of them has exquisitely fragile values—when they believe that switching a donation from one charity to another destroys galaxies.
I expect your decision algorithm will find a way to deal with people who won’t negotiate on some topics—or who behave in a manner you have a hard time predicting. Some trouble for you, maybe—but probably not THE END OF THE WORLD.
Looking at the last 10 years, there seems to be some highly-predictable fund raising activity, and a lot of philosophising about the importance of machine morality.
I see some significant patterns there. It is not remotely like a stream of random events. So: what gives?
Sure, the question of whether a superintelligence will construct a superior morality to that which natural selection and cultural evolution have constructed on Earth is in some sense a narrow technical question. (The related question of whether the phrase “superior morality” even means anything is, also.)
But it’s a technical question that pertains pretty directly to the question of whose side one envisions oneself on.
That is, if one answers “yes,” it can make sense to ally with the Singularity rather than humanity (assuming that even means anything) as EY-1998 claims to, and still expect some unspecified good (or perhaps Good) result. Whereas if one answers “no,” or if one rejects the very idea that there’s such a thing as a superior morality, that justification for alliance goes away.
That said, I basically agree with you, though perhaps for different reasons than yours.
That is, even after embracing the idea that no other values, even those held by a superintelligence, can be superior to human values, one is still left with the same choice of alliances. Instead of “side with humanity vs. the Singularity,” the question involves a much narrower subset: “side with humanity vs. FAI-induced Singularity,” but from our perspective it’s a choice among infinities.
Of course, advocates of FAI-induced Singularity will find themselves saying that there is no conflict, really, because an FAI-induced Singularity will express by definition what’s actually important about humanity. (Though, of course, there’s no guarantee that individual humans won’t all be completely horrified by the prospect.)
Recall that after CEV extrapolates current humans’ volitions and construes a coherent superposition, the next step isn’t “do everything that superposition says”, but rather, “ask that superposition the one question ‘Given the world as it is right now, what program should we run next?’, run that program, and then shut down”. I suppose it’s possible that our CEV will produce an AI that immediately does something we find horrifying, but I think our future selves are nicer than that… or could be nicer than that, if extrapolated the right way, so I’d consider it a failure of Friendliness if we get a “do something we’d currently find horrifying for the greater good” AI if a different extrapolation strategy would have resulted in something like a “start with the most agreeable and urgent stuff, and other than that, protect us while we grow up and give us help where we need it” AI.
I really doubt that we’d need an AI to do anything immediately horrifying to the human species in order to allow it to grow up into an awesome fun posthuman civilization, so if CEV 1.0 Beta 1 appeared to be going in that direction, that would probably be considered a bug and fixed.
(shrug) Sure, if you’re right that the “most urgent and agreeable stuff” doesn’t happen to press a significant number of people’s emotional buttons, then it follows that not many people’s emotional buttons will be pressed.
But there’s a big difference between assuming that this will be the case, and considering it a bug if it isn’t.
Either I trust the process we build more than I trust my personal judgments, or I don’t.
If I don’t, then why go through this whole rigamarole in the first place? I should prefer to implement my personal judgments. (Of course, I may not have the power to do so, and prefer to join more powerful coalitions whose judgments are close-enough to mine. But in that case CEV becomes a mere political compromise among the powerful.)
If I do, then it’s not clear to me that “fixing the bug” is a good idea.
That is, OK, suppose we write a seed AI intended to work out humanity’s collective CEV, work out some next-step goals based on that CEV and an understanding of likely consequences, construct a program P to implement those goals, run P, and quit.
Suppose that I am personally horrified by the results of running P. Ought I choose to abort P? Or ought I say to myself “Oh, how interesting: my near-mode emotional reactions to the implications of what humanity really wants are extremely negative. Still, most everybody else seems OK with it. OK, fine: this is not going to be a pleasant transition period for me, but my best guess is still that it will ultimately be for the best.”
Is there some number of people such that if more than that many people are horrified by the results, we ought to choose to abort P?
Does the question even matter? The process as you’ve described it doesn’t include an abort mechanism; whichever choice we make P is executed.
Ought we include such an abort mechanism? It’s not at all clear to me that we should. I can get on a roller-coaster or choose not to get on it, but giving me a brake pedal on a roller coaster is kind of ridiculous.
It’s partly a chance vs necessity question.
It is partly a question about whether technological determinism is widespread.
Apparently he changed his mind about a bunch of things.
On what appears to be their current plan, the SIAI don’t currently look very dangerous, IMHO.
Eray Ozkural recently complained: “I am also worried that backwards people and extremists will threaten us, and try to dissuade us from accomplishing our work, due to your scare tactics.”
I suppose that sort of thing is possible—but my guess is that they are mostly harmless.
Or so you hope.
Yes, I agree. I don’t really believe that he only learnt how to disguise his true goals. But I’m curious if you would be satisfied with his word alone if he would be able to run a fooming AI next week only if you gave your OK?
He has; this is made abundantly clear in the Metaethics sequence and particularly the “coming of age” sequence. That passage appears to be a reflection of the big embarrassing mistake he talked about, when he thought that he knew nothing about true morality (see “Could Anything Be Right?”) and that a superintelligence with a sufficiently “unconstrained” goal system (or what he’d currently refer to as “a rock”) would necessarily discover the ultimate true morality, so that whatever this superintelligence ended up doing would necessarily be the right thing, whether that turned out to consist of giving everyone a volcano lair full of catgirls/boys or wiping out humanity and reshaping the galaxy for its own purposes.
Needless to say, that is not his view anymore; there isn’t even any “Us or Them” to speak of anymore. Friendly AIs aren’t (necessarily) people, and certainly won’t be a distinct race of people with their own goals and ambitions.
Yes, I’m not suggesting that he is just signaling all that he wrote in the sequences to persuade people to trust him. I’m just saying that when you consider what people are doing for much less than shaping the whole universe to their liking, one might consider some sort of public or third-party examination before anyone is allowed to launch a fooming AI.
The hard part there is determining who’s qualified to perform that examination.
It will probably never come to it anyway. Not because the SIAI is not going to succeed, but because if it told anyone that it is even close to implementing something like CEV then the whole might of the world would crush it (if the world didn’t turn rational until then). Because to say that you are going to run a fooming AI will be interpreted as trying to take over all power and rule the universe. I suppose this is also the most likely reason for the SIAI to fail. The idea is out and once people notice that fooming AI isn’t just science fiction they will do everything to stop anyone from either implementing one at all or to run their own before anyone else does. And who’ll be the first competitor to take out in the race to take over the universe? The SIAI of course, just search Google. I guess it would have been a better idea to make this a stealth project from day one. But that train has left.
Anyway, if the SIAI does succeed one can only hope that Yudkowsky is not Dr. Evil in disguise. But even that would still be better than a paperclip maximizer. I assign more utility to a universe adjusted to Yudkowsky’s volition (or the SIAI) than paperclips (I suppose even if that means I’ll not “like” what happens to me then).
I don’t see who is going to enforce that. Probably nobody.
What we are fairly likely to see is open-source projects getting more limelight. It is hard to gather mindshare if your strategy is: trust the code to us. Relatively few programmers are likely to buy into such projects—unless you pay them to do so.
Yes on the question of humans vs Singularity.
(His word alone would not be enough to convince me he’s gotten the fooming AI friendly, though, so I would not give the OK for prudential reasons.)
So you take him at his word that he’s working in your best interest. You don’t think it is necessary to supervise the SIAI while working towards friendly AI. But once they finished their work, ready to go, you are in favor of some sort of examination before they can implement it. Is that correct?
I don’t think human selfishness vs. public interest is much of a problem with FAI; everyone’s interests with respect to FAI are well correlated, and making an FAI which specifically favors its creator doesn’t give enough extra benefit over an FAI which treats everyone equally to justify the risks (that the extra term will be discovered, or that the extra term introduces a bug). Not even for a purely selfish creator; FAI scenarios just don’t leave enough room for improvement to motivate implementing something else.
On the matter of inspecting AIs before launch, however, I’m conflicted. On one hand, the risk of bugs is very serious, and the only way to mitigate it is to have lots of qualified people look at it closely. On the other hand, if the knowledge that a powerful AI was close to completion became public, it would be subject to meddling by various entities that don’t understand what they’re doing, and it would also become a major target for espionage by groups of questionable motives and sanity who might create UFAIs. These risks are difficult to balance, but I think secrecy is the safer choice, and should be the default.
If your first paragraph turns out to be true, does that change anything with respect to the problem of human and political irrationality? My worry is that even if there is only one rational solution that everyone should favor, how likely is it that people understand and accept this? That might be no problem given the current perception. If the possibility of fooming AI will still be ignored at the point it will be possible to implement friendliness (CEV etc.), then there will be no opposition. So some quick quantum leaps towards AGI will likely allow the SIAI to follow through on it. But my worry is that if the general public or governments notice this possibility and take it seriously, it will turn into a political mess never seen before. The world would have to be dramatically different for the big powers to agree on something like CEV. I still think this is the most likely failure mode in case the SIAI succeeds in defining friendliness before someone else runs a fooming AI. Politics.
I agree. But is that still possible? After all we’re writing about it in public. Although to my knowledge the SIAI never suggested that it would actually create a fooming AI, only come up with a way to guarantee its friendliness. But what you said in your second paragraph would suggest that the SIAI would also have to implement friendliness or otherwise people will take advantage of it or simply mess it up.
This?
http://www.acceleratingfuture.com/people-blog/?p=196
Probably it would be easier to run the examination during the SIAI’s work, rather than after. Certainly it would save more lives. So, supervise them, so that your examination is faster and more thorough. I am not in favour of pausing the project, once complete, to examine it if it’s possible to examine it in operation.
At the bottom—just after where he talks about his “transfer of allegiance”—it says:
©1998 by Eliezer S. Yudkowsky.
We can’t say he didn’t warn us ;-)
IMO, it is somewhat reminiscent of certain early Zuckerberg comments.
Eliezer1998 is almost as scary as Hanson2010 - and for similar reasons.
1998 you mean?
Yes. :)
What Zuckerberg comments are you referring to?
The IM ones where he says “trust me”.
Zuckerberg probably thought they were private, though. I added a link.
If you follow the link:
You shouldn’t seek to “weaken an argument”, you should seek what is the actual truth, and then maybe ways of communicating your understanding. (I believe that’s what you intended anyway, but think it’s better not to say it this way, as a protective measure against motivated cognition.)
I like your parenthetical, I often want to say something like this, and you’ve put it well.
Thank-you for pointing this out.
I took wedrifid’s point as being that whether EY is right or not, the bad effect described happens. This is part of the lose-lose nature of the original problem (what to do about a post that hurt people).
I don’t think this rhetoric is applicable. Several very intelligent posters have deemed the idea dangerous; a very intelligent you deems it safe. You argue they are wrong because it is ‘obviously safe’.
Eliezer is perfectly correct to point out that, on the whole of it, ‘obviously it is safe’ just does not seem like strong enough evidence when it’s up against a handful of intelligent posters who appear to have strong convictions.
Pardon? I don’t believe I’ve said any such thing here or elsewhere. I could of course be mistaken—I’ve said a lot of things and don’t recall them all perfectly. But it seems rather unlikely that I did make that claim because it isn’t what I believe.
This leads me to the conclusion that...
… This rhetoric isn’t applicable either. ;)
I should have known I wouldn’t get away with that, eh? I actually don’t know if you oppose the decision because you think the idea is safe, or because you think that censorship is wronger than the idea is dangerous, or whether you even oppose the decision at all and were merely pointing out appeals to authority. If you could fill me in on the details, I could re-present the argument as it actually applies.
Thank you, and yes, I can see the point behind what you were actually trying to say. It is just important to me that I am not misrepresented (even though you had no malicious intent).
There are obvious (well, at least theoretically deducible based on the kind of reasoning I tend to discuss or that used by harry!mor) reasons why it would be unwise to give a complete explanation of all my reasoning.
I will say that ‘censorship is wronger’ is definitely not the kind of thinking I would use. Indeed, I’ve given examples of things that I would definitely censor. Complete with LOTR satire if I recall. :)
This isn’t evidence about that hypothesis, it’s expected that most certainly nothing happens. Yet you write for rhetorical purposes as if it’s supposed to be evidence against the hypothesis. This constitutes either lying or confusion (I expect it’s unintentional lying, with phrases produced without conscious reflection about their meaning, so a little of both lying and confusion).
The sentence of Vaniver’s you quote seems like a straightforward case of responding to hyperbole with hyperbole in kind.
That won’t be as bad-intentioned, but still as wrong and deceptive.
The point we are trying to make is that we think the people who stared the basilisk in the eyes and metaphorically turned to stone are stronger evidence.
I get that. But I think it’s important to consider both positive and negative evidence- if someone’s testimony that they got turned to stone is important, so are the testimonies of people who didn’t get turned to stone.
The question to me is whether the basilisk turns people to stone or people turn themselves into stone. I prefer the second because it requires no magic powers on the part of the basilisk. It might be that some people turn to stone when they see goatse for the first time, but that tells you more about humans and how they respond to shock than about goatse.
Indeed, that makes it somewhat useful to know what sort of things shock other people. Calling this idea ‘dangerous’ instead of ‘dangerous to EY’ strikes me as mind projection.
I am considering both.
I generally find myself in support of people who advocate a policy of keeping people from seeing Goatse.
I’m not sure how to evaluate this statement. What do you mean by “keeping people from seeing Goatse”? Banning? Voluntarily choosing not to spread it? A filter like the one proposed in Australia that checks every request to the outside world?
Censoring posts that display Goatse on LessWrong.
Generally, censoring posts that display Goatse on non-Goatse websites.
I am much more sympathetic to “keeping goatse off of site X” than “keeping people from seeing goatse,” and so that’s a reasonable policy. If your site is about posting pictures of cute kittens, then goatse is not a picture of a cute kitten.
However, it seems to me that suspected Langford basilisks are part of the material of LessWrong. Imagine someone posted in the discussion “hey guys, I really want to be an atheist but I can’t stop worrying about whether or not the Rapture will happen, and if it does life will suck.” It seems to me that we would have a lot to say to them about how they could approach the situation more rationally.
And, if Langford basilisks exist, religion has found them. Someone got a nightmare because of Roko’s idea, but people fainted upon hearing Sinners in the Hands of an Angry God. Why are we not looking for the Perseus for this Medusa? If rationality is like an immune system, and we’re interested in refining our rationality, we ought to be looking for antibodies.
It seems to me that Eliezer’s response as moderator of LessWrong strongly implies that he does not believe this is the case. Your goal, then, would be to convince Eliezer that it ought to be part of the LessWrong syllabus, as it were. Cialdini’s Influence and other texts would probably advise you to work within his restrictions and conform to his desires as much as practical—on a site like LessWrong, though, I am not sure how applicable the advice would be, and in any case I don’t mean to be prescriptive about it.
Right. I see a few paths to do that that may work (and no, holding the future hostage is not one of them).
Is Goatse supposed to be a big deal? Someone showed it to me and I literally said “who cares?”
I totally agree. There are far more important internet requests that my (Australian) government should be trying to filter. Priorities people!
Yes.
I feel like reaction videos are biased towards people who have funny or dramatic reactions, but point taken.
I don’t understand this. (Play on conservation of expected evidence? In what way?)
Normal updating.
Original prior for basilisk-danger.
Eliezer_Yudkowsky stares at basilisk, turns to stone (read: engages idea, decides to censor). Revise pr(basilisk-danger) upwards.
FormallyknownasRoko stares at basilisk, turns to stone (read: appears to truly wish e had never thought it). Revise pr(basilisk-danger) upwards.
Vladimir_Nesov stares at basilisk, turns to stone (read: engages idea, decides it is dangerous). Revise pr(basilisk-danger) upwards.
Vaniver stares at basilisk, is unharmed (read: engages idea, decides it is not dangerous). Revise pr(basilisk-danger) downwards.
Posterior is higher than original prior.
For the posterior to be equal to or lower than the prior, Vaniver would have to be more of a rationalist than Eliezer, Roko, and you put together.
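The updating steps listed above can be sketched in the odds form of Bayes' rule. This is only an illustration: the prior and every likelihood ratio below are made-up numbers, `update_odds` is a hypothetical helper, and the sketch assumes the four reports are independent of one another, which other commenters in this thread dispute.

```python
def update_odds(prior_prob, likelihood_ratios):
    """Apply a sequence of likelihood ratios to a prior probability.

    Each ratio is P(report | dangerous) / P(report | not dangerous).
    Reports are treated as independent (an assumption of this sketch).
    """
    odds = prior_prob / (1.0 - prior_prob)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


# Hypothetical numbers, chosen only to illustrate the direction of each update.
prior = 0.01
reports = {
    "Eliezer_Yudkowsky": 2.0,    # reacted negatively: ratio > 1, estimate goes up
    "FormallyknownasRoko": 2.0,  # reacted negatively
    "Vladimir_Nesov": 2.0,       # reacted negatively
    "Vaniver": 0.5,              # did not react negatively: ratio < 1, estimate goes down
}

posterior = update_odds(prior, reports.values())
print(f"prior={prior:.4f} posterior={posterior:.4f}")

# For the posterior to return to the prior, the single dissenting report
# would need a ratio that cancels the other three combined (here 1/8).
break_even = update_odds(prior, [2.0, 2.0, 2.0, 1.0 / 8.0])
print(f"break-even posterior={break_even:.4f}")
```

With equal-strength reports, three upward revisions and one downward revision leave the posterior above the prior; the dissenting report only cancels the rest if it is weighted as heavily as the other three together.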
Okay, but more than four people have engaged with the idea. Should we take a poll?
The problem of course is that majorities often believe stupid things. That is why a free marketplace of ideas free from censorship is a really good thing! The obvious thing to do is exchange information until agreement but we can’t do that, at least not here.
Also, the people who think it should be censored all seem to disagree about how dangerous the idea really is, suggesting it isn’t clear how it is dangerous. It also seems plausible that some people have influenced the thinking of other people- for example it looks like Roko regretted posting after talking to Eliezer. While Roko’s regret is evidence that Eliezer is right, it isn’t the same as independent/blind confirmation that the idea is dangerous.
When you give all agents equal weight, sure. Without taking a poll of anything except my memory, Eliezer+Roko+VladNesov+Alicorn are against, DavidGerard+waitingforgodel+vaniver are for. Others are more sidelined than supporting a particular side.
Aumann agreement works in the case of hidden information—all you need are posteriors and common knowledge of the event alone.
Roko increased his estimation and Eliezer decreased his estimation, and the amounts by which they did so are balanced according to the strength of their private signals. Looking at two Aumann-agreed conclusions gives you the same evidence as looking at the pre-Aumann (differing) conclusions, the same way that 10, 10 gives you the same average as 5, 15.
I would prefer you not treat people avoiding a discussion as evidence that people don’t differentially evaluate the assertions made in that discussion.
Doing so creates a perverse incentive whereby chiming in to say “me too!” starts to feel like a valuable service, which would likely chase me off the site altogether. (Similar concerns apply to upvoting comments I agree with but don’t want to see more of.)
If you are seriously interested in data about how many people believe or disbelieve certain propositions, there exist techniques for gathering that data that are more reliable than speculating.
If you aren’t interested, you could just not bring it up.
I treat them as not having given me evidence either way. I honestly don’t know how I could treat them otherwise.
It is extremely hard to give no evidence by making a decision, even a decision to do nothing.
Okay. It is not that they give no evidence by remaining out of the discussion—it is that the evidence they give is spread equally over all possibilities. I don’t know enough about these people to say that discussion-abstainers are uniformly in support or in opposition to the idea. The best I can do is assume they are equally distributed between support and opposition, and not incorrectly constrain my anticipations.
You can do better than that along a number of different dimensions.
But even before getting there, it seems important to ask whether our unexpressed beliefs are relevant.
That is, if it turned out that instead of “equally distributed between support and opposition”, we are 70% on one side, or 90%, or 99%, or that there are third options with significant membership, would that information significantly affect your current confidence levels about what you believe?
If our unexpressed opinions aren’t relevant, you can just not talk about them at all, just like you don’t talk about millions of other things that you don’t know and don’t matter to you.
If they are relevant, one thing you could do is, y’know, research. That is, set up a poll clearly articulating the question and the answers that would affect your beliefs and let people vote for their preferred answers. That would be significantly better than assuming equal distribution.
Another thing you could do, if gathering data is unpalatable, is look at the differential characteristics of groups that express one opinion or another and try to estimate what percentage of the site shares which characteristics.
Yes. In the absence of actual evidence (which seems dangerous to gather in the case of this basilisk), I pretty much have to go by expressed opinions. To my mind, it was like trying to count the results of experiments that haven’t been performed yet.
I did not seek out more information because it was a throwaway line in an argument attempting to explain to people why it appears their voices are being ignored. I personally am on the side of censoring the idea, not having understood it at all when it was first posted, and that may have bled into my posts (I should have exercised stronger control over that) but I am not arguing for censorship. I am arguing why, when someone says “it’s not dangerous!”, some people aren’t coming around to their perspective.
I don’t intend to argue for the censorship of the idea unless sorely pressed.
A few things:
** I’m confused. On the one hand, you say knowing the popularity of various positions is important to you in deciding your own beliefs about something potentially dangerous to you and others. On the other hand, you say it’s not worth seeking more information about and was just a throwaway line in an argument. I am having a hard time reconciling those two claims… you seem to be trying to have it both ways. I suspect I’ve misunderstood something important.
** I didn’t think you were arguing for censorship. Or against it. Actually, I have long since lost track of what most participants in this thread are arguing for, and in some cases I’m not sure they themselves know.
** I agree with you that the existence of knowledgeable people who think something is dangerous is evidence that it’s dangerous.
** Since it seems to matter: for my own part, I rate the expected dangerousness of “the basilisk” very low, and the social cost to the group of the dispute over “censoring” it significantly higher but still low.
** I cannot see why that should be of any evidentiary value whatsoever, to you or anyone else. Whether I’m right or wrong, my position is a pretty easy-to-reach one; it’s the one you arrive at in the absence of other salient beliefs (like, for example, the belief that EY/SIAI is a highly reliable estimator of potential harm done by “basilisks” in general, or the belief that the specific argument for the harmfulness of this basilisk is compelling). And most newcomers will lack those other beliefs. So I expect that quite a few people share my position—far more than 50% -- but I can’t see why you ought to find that fact compelling. That a belief is very widely shared among many many people like me who don’t know much about the topic isn’t much evidence for anything.
Sometimes that isn’t a bad state to be in. Not having an agenda to serve frees up the mind somewhat! :)
(nods) I’m a great believer in it. Especially in cases where a disagreement has picked up momentum, and recognizable factions have started forming… for example, if people start suggesting that those who side with the other team should leave the group. My confidence in my ability to evaluate an argument honestly goes up when I genuinely don’t know what team that argument is playing for.
I suspect I’ve obfuscated it, actually. The popularity of various positions is not intrinsically important to me; in fact, I give professions of belief about as little credit as I can get away with. This specific case is such that every form of evidence I find stronger (reasoning through the argument logically for flaws; statistical evidence about its danger) is not available. With a dearth of stronger evidence, I have to rely on weak evidence, but “the evidence is weak” is not an argument for privileging my own unsubstantiated position.
I don’t feel the need to collect weak evidence … I should, in this case. I was following a heuristic of not collecting weak evidence (waste of effort) without noticing that there was no stronger evidence.
Why are people’s beliefs of any value? Everyone has the ability to reason. All (non-perfect) reasoners fail in some way or another; if I look at many (controlling for biased reasoning) it gives me more of a chance to spot the biases—I have a control to compare it to.
This case is a special case; some people do have evidence. They’ve read the basilisk, applied their reasoning and logic, and deduced that it is / is not dangerous. These people’s beliefs are to be privileged over those of people who have not read the basilisk. I can’t access private signals like that, because I don’t want to read a potential basilisk. So I make a guess at how strong their private signal is (this is why I care about their rationality) and use that as weak evidence for or against.
If seeking harder evidence wasn’t dangerous (and it usually isn’t) I would have done that instead.
The sentence I quoted sounded to me as though you were treating those of us who’ve remained “sidelined” as evidence of something. But if you were instead just bringing us up as an example of something that provides no evidence of anything, and if that was clear to everyone else, then I’m content.
I think I had a weird concept of what ‘sidelined’ meant in my head when I was writing that. Certainly, it seems out of place to me now.
I’m for. I believe Tim Tyler is for.
Humans have this unfortunate feature of not being logically omniscient. In such cases where people don’t see all the logical implications of an argument we can treat those implications as hidden information. If this wasn’t the case then the censorship would be totally unnecessary as Roko’s argument didn’t actually include new information. We would have all turned to stone already.
There is no way for you to have accurately assessed this. Roko and Eliezer aren’t idealized Bayesian agents; it is extremely unlikely they performed a perfect Aumann agreement. If one is more persuasive than the other for reasons other than the evidence they share, then their combined support for the proposition may not be worth the same as that of two people who independently came to support the proposition. Besides which, according to you, what information did they share exactly?
I had a private email conversation with Eliezer that did involve a process of logical discourse, and another with Carl.
Also, when I posted the material, I hadn’t thought it through. Once I had thought it through, I realized that I had accidentally said more than I should have done.
David_Gerard, Jack, timtyler, waitingforgodel, and Vaniver do not currently outweigh Eliezer_Yudkowsky, FormallyknownasRoko, Vladimir_Nesov, and Alicorn, as of now, in my mind.
It does not need to be a perfect Aumann agreement; a merely good one will still reduce the chances of overcounting or undercounting either side’s evidence well below the acceptable limits.
They are approximations of Bayesian agents, and it is extremely likely they performed an approximate Aumann agreement.
To settle this particular question, however, I will pay money. I promise to donate 50 dollars to the Singularity Institute for Artificial Intelligence, independent of other plans to donate, if Eliezer confirms that he did revise his estimate down; or if he confirms that he did not revise his estimate down. Payable within two weeks of Eliezer’s comment.
I’m curious: if he confirms instead that the change in his estimate, if there was one, was small enough relative to his estimate that he can’t reliably detect it or detect its absence, although he infers that he updated using more or less the same reasoning you use above, will you donate or not?
I will donate.
I would donate even if he said that he revised his estimate upwards.
I would then seriously reconsider my evaluation of him, but as it stands the offer is for him to weigh in at all, not weigh in on my side.
edit: I misparsed your comment. That particular answer would dance very close to ‘no comment’, but unless it seemed constructed that way on purpose, I would still donate.
Yeah, that’s fair. One of the things I was curious about was, in fact, whether you would take that answer as a hedge, but “it depends” is a perfectly legitimate answer to that question.
How many of me would there have to be for that to work?
Also, why is rationalism the risk factor for this basilisk? Maybe the basilisk only turns to stone people with brown eyes (or the appropriate mental analog).
Only one; I meant ‘you’ in that line to refer to Vlad. It does raise the question “how many people disagree before I side with them instead of Eliezer/Roko/Vlad”. And the answer to that is … complicated. Each person’s rationality, modified by how much it was applied in this particular case, is the weight I give to their evidence; then the full calculation of evidence for and against should bring my prior to within epsilon but preferably below my original prior for me to decide the idea is safe.
Rationalism is the ability to think well and this is a dangerous idea. If it were a dangerous bacterium then immune system would be the risk factor.
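The "how many people disagree before I side with them" question above can be made concrete under a simplifying assumption: each person contributes a single numeric weight (rationality times how much it was applied), and dissenters are all weighted equally. The function name and all weights here are hypothetical, and the real calculation is described above as more complicated than this.

```python
import math


def dissenters_needed(concerned_weights, dissenter_weight):
    """Smallest number of equally weighted dissenters whose combined
    weight matches or exceeds the combined weight of the concerned readers.

    Weights are abstract evidence strengths (an assumption of this sketch).
    """
    if dissenter_weight <= 0:
        raise ValueError("dissenter_weight must be positive")
    total_concern = sum(concerned_weights)
    return math.ceil(total_concern / dissenter_weight)


# Hypothetical weights for three concerned readers.
concerned = [1.0, 1.0, 1.0]

# If each dissenter is trusted as much as each concerned reader:
print(dissenters_needed(concerned, 1.0))

# If each dissenter is trusted half as much:
print(dissenters_needed(concerned, 0.5))
```

The lower the weight given to each dissenter relative to the concerned readers, the more dissenters it takes to bring the estimate back to the original prior.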
Generally, if your immune system is fighting something, you’re already sick. Most pathogens are benign or don’t have the keys to your locks. This might be a similar situation- the idea is only troubling if your lock fits it- and it seems like then there would be rational methods to erode that fear (like the immune system mobs an infection).
The analogy definitely breaks down, doesn’t it? What I had in mind was Eliezer, Roko, and Vlad saying “I got sick from this infection” and you saying “I did not get sick from this infection”—I would look at how strong each person’s immune system is.
So if Eliezer, Roko, and Vlad all had weak immune systems and yours was quite robust, I would conclude that the bacterium in question is not particularly virulent. But if three robust immune systems all fell sick, and one robust immune system did not, I would be forced to decide between some hypotheses:
the first three are actually weak immune systems
the fourth was not properly exposed to the bacterium
the fourth has a condition that makes it immune
the bacterium is not virulent, the first three got unlucky
On the evidence I have, the middle two seem more likely than the first and last hypotheses.
I agree- my money is on #3 (but I’m not sure whether I would structure it as “fourth is immune” or “first three are vulnerable”- both are correct, but which is the more natural word to use depends on the demographic response).
Er, are you describing rationalism (I note you say that and not “rationality”) as susceptible to autoimmune disorders? More so than in this post?
This equivocates the intended meaning of turning to stone in the original discussion you replied to. Fail. (But I understand what you meant now.)
Sorry, I should not have included censoring specifically. Change the “read:”s to ‘engages, reacts negatively’, ‘engages, does not react negatively’ and the argument still functions.
The argument does seem to function, but you shouldn’t have used the term in a sense conflicting with intended.
You would need a mechanism for actually encouraging them to “overcome” the terror, rather than reinforce it. Otherwise you might find that your subjects are less rational after this process than they were before.
Right- and current methodologies when it comes to that sort of therapy are better done in person than over the internet.
One wonders how something like that might have evolved, doesn’t one? What happened to all the humans who came with the mutation that made them want to find out whether the sabre-toothed tiger was friendly?
I don’t see how very unlikely events that people knew the probability of would have been part of the evolutionary environment at all.
In fact, I would posit that the bias is most likely due to having a very high floor for probability. In the evolutionary environment things with probability you knew to be <1% would be unlikely to ever be brought to your attention. So not having any good method for intuitively handling probabilities between 1% and zero would be expected.
In fact, I don’t think I have an innate handle on probability to any finer grain than ~10% increments. Anything more than that seems to require mathematical thought.
Probably less than 1% of cave-men died by actively seeking out the sabre-toothed tiger to see if it was friendly. But I digress.
But probably far more than 1% of cave-men who chose to seek out a sabre-toothed tiger to see if it was friendly died due to doing so.
The relevant question on an issue of personal safety isn’t “What % of the population die due to trying this?”
The relevant question is: “What % of the people who try this will die?”
In the first case, rollerskating downhill, while on fire, after having taken arsenic would seem safe (as I suspect no-one has ever done precisely that)
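The difference between the two questions above is the denominator. A minimal sketch, with invented counts and hypothetical function names, makes the gap visible:

```python
def population_death_rate(deaths, population):
    """Share of the whole population that died from the activity."""
    return deaths / population


def attempt_death_rate(deaths, attempts):
    """Share of those who tried the activity that died from it.

    Returns None when nobody has tried it, since the rate is then undefined.
    """
    if attempts == 0:
        return None
    return deaths / attempts


# Invented numbers for the tiger example.
population = 100_000
attempts = 500
deaths = 400

print(population_death_rate(deaths, population))  # well under 1% of everyone
print(attempt_death_rate(deaths, attempts))       # most of those who tried

# An activity nobody has tried: zero deaths in the population,
# but no information at all about how dangerous it is to try.
print(population_death_rate(0, population))
print(attempt_death_rate(0, 0))
```

The population rate can be tiny, or exactly zero, while the rate among those who actually try stays high or simply unknown.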
No, really, one doesn’t wonder. It’s pretty obvious. But if we’ve gotten to the point where “this bias paid off in the evolutionary environment!” is actually used as an argument, then we are off the rails of refining human rationality.
What’s wrong with using “this bias paid off in the evolutionary environment!” as an argument? I think people who paid more attention to this might make fewer mistakes, especially in domains where there isn’t a systematic, exploitable difference between EEA and now.
The evolutionary environment contained entities capable of dishing out severe punishments, uncertainty, etc.
If anything, I think that the heuristic that an idea “obviously” can’t be dangerous is the problem, not the heuristic that one should take care around possibilities of strong penalties.
It is a fine argument for explaining the widespread occurrence of fear. However, today humans are in an environment where their primitive paranoia is frequently triggered by inappropriate stimuli.
Dan Gardner goes into this in some detail in his book: Risk: The Science and Politics of Fear
Video of Dan discussing the topic: Author Daniel Gardner says Americans are the healthiest and safest humans in the world, but are irrationally plagued by fear. He talks with Maggie Rodriguez about his book ‘The Science Of Fear.’
He says “we” are the healthiest and safest humans ever to live, but I’m very skeptical that this refers specifically to Americans rather than present day first world nation citizens in general.
Yes, we are, in fact, safer than in the EEA, in contemporary USA.
But still, there are some real places where danger is real, like the Bronx or Scientology or organized crime or walking across a freeway. So, don’t go rubbishing the heuristic of being frightened of potentially real danger.
I think it would only be legitimate to criticize fear itself on “outside view” grounds if we lived in a world with very little actual danger, which is not at all the case.
So, this may be a good way to approach the issue: loss to individual humans is, roughly speaking, finite. Thus, the correct approach to fear is to gauge risks by their chance of loss, and then discount if it’s not fatal.
So, we should be much less worried by a 1e-6 risk than a 1e-4 risk, and a 1e-4 risk than a 1e-2 risk. If you are more scared by a 1e-6 risk than a 1e-2 risk, you’re reasoning fallaciously.
Now, one might respond- “but wait! This 1e-6 risk is 1e5 times worse than the 1e-2 risk!”. But that seems to fall into the traps of visibility bias and privileging the hypothesis. If you’re considering a 1e-6 risk, have you worked out not just all the higher order risks, but also all of the lower order risks that might have higher order impact? And so when you have an idea like the one in question, which I would give a risk of 1e-20 for discussion’s sake, and you consider it without also bringing into your calculus essentially every other risk possible, you’re not doing it rigorously. And, of course, humans can’t do that computation.
Now, the kicker here is that we’re talking about fear. I might fear the loss of every person I know just as strongly as I fear the loss of every person that exists, but be willing to do more to prevent the loss of everyone that exists (because that loss is actually larger). Fear has psychological ramifications, not decision-theoretic ones. If this idea has 1e-20 chances of coming to pass, you can ignore it on a fear level, and if you aren’t, then I’m willing to consider that evidence you need help coping with fear.
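The arithmetic behind the "1e-6 risk that is 1e5 times worse" objection above can be written out as probability times magnitude. This sketch only shows that arithmetic, with the loss of the 1e-2 risk normalized to 1 as an assumption; it does not capture the reply that a rigorous comparison would need every other risk included as well. The function names are hypothetical.

```python
def expected_loss(probability, magnitude):
    """Expected loss of a single risk: chance of the loss times its size."""
    return probability * magnitude


def break_even_magnitude(probability, reference_loss):
    """Magnitude a risk would need for its expected loss to equal reference_loss."""
    return reference_loss / probability


# Loss magnitudes are in arbitrary units; the common risk is normalized to 1.
common = expected_loss(1e-2, 1.0)
rare_but_severe = expected_loss(1e-6, 1e5)

print(common, rare_but_severe)

# With equal magnitudes, the rarer risk is simply the smaller one.
print(expected_loss(1e-6, 1.0) < expected_loss(1e-2, 1.0))

# How large a loss a 1e-20 risk would need in order to match the common risk.
print(break_even_magnitude(1e-20, common))
```

On these numbers the rare risk has the larger expected loss only because of the assumed 1e5 multiplier; at a probability of 1e-20 the multiplier needed to compete grows to roughly 1e18.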
I have a healthy respect for the adaptive aspects of fear. However, we do need an explanation for the scale and prevalence of irrational paranoia.
The picture of an ancestral water hole surrounded by predators helps us to understand the origins of the phenomenon. The ancestral environment was a dangerous and nasty place where people led short, brutish lives. There, living in constant fear made sense.
Someone’s been reading Terry Pratchett.
Considering the extraordinary appeal that forbidden knowledge has even for the average person, let alone the exceptionally intellectually curious, I don’t think this is a very effective way to warn a person off of seeking out the idea in question. Far from deserving what they get, such a person is behaving in a completely ordinary manner, to exceptionally severe consequence.
Personally, I don’t want to know about the idea (at least not if it’s impossible without causing myself significant psychological distress to no benefit,) but I’ve also put significant effort into training myself out of responses such as automatically clicking links to shock sites that say “Don’t click this link!”
Hmm. It is tricky to go back, I would imagine.
The material does come with some warnings, I believe. For instance, consider this one:
“Beware lest Friendliness eat your soul.”—Eliezer Yudkowsky
As I understand, you donate (and plan to in the future) to existential risk charities, and that is one of the consequences of you having come across that link. How does this compute into net negative, in your estimation, or are you answering a different question?
Sure I want to donate. But if you express it as a hypothetical choice between being a person who didn’t know about any of this and had no way of finding out, versus what I have now, I choose the former. Though since that is not an available choice, it is a somewhat academic question.
I can’t believe I’m hearing this from a person who wrote about Ugh fields. I can’t believe I’m reading a plea for ignorance on a blog devoted to refining rationality. Ignorance is bliss, is that the new motto now?
Well look, one has to do cost/benefit calculations, not just blindly surge forward in some kind of post-enlightenment fervor. To me, it seems like there is only one positive term in the equation: the altruistic value of giving money to some existential risk charity.
All the other terms are negative, at least for me. And unless I actually overcome excuses, akrasia, etc to donate a lot, I think it’ll all have been a mutually detrimental waste of time.
There is only one final criterion, the human decision problem. It trumps any other rule, however good or useful.
(You appeal to particular heuristics, using the feeling of indignation as a rhetorical weapon.)
Not helping. I was referring to the moral value of donations as an argument for choosing to know, as opposed to not knowing. You don’t seem to address that in your reply (did I miss something?).
Oh, I see. Well, I guess it depends upon how much I eventually donate and how much of an incremental difference that makes.
It would certainly be better to just donate, AND to also not know anything about anything dangerous. I’m not even sure that’s possible, though. For all we know, just knowing about any of this is enough to land you in a lot of trouble either in the causal future or elsewhere.
I gather doing so would irritate our site’s host and moderator.
I wasn’t “filled in”, and I don’t know whether my argument coincides with Eliezer’s. I also don’t understand why he won’t explain his argument, if it’s the same as mine, now that the content is in the open (but it’s consistent with, that is, it responds to the same reasons as, continuing to remove comments pertaining to the topic of the post, which makes it less of a mystery).
But you think that it is not a good thing for this to propagate more?
As a decision on expected utility under logical uncertainty, but with extremely low confidence, yes. I can argue that it most certainly won’t be a bad thing (which I even attempted in comments to the post itself, my bad); the expectation of it being a bad thing derives from the remaining possibility of those arguments failing. As Carl said, “that estimate is unstable in the face of new info” (which refers to his own argument, not necessarily mine).
For everyone who wants to know what this discussion is all about, the forbidden idea, here is something that does not resemble it except for its stupefying conclusions:
There’s this guy who has the idea that it might be rational to rob banks to donate the money to charities. He tells a friend who works at a bank, who freaks out and urges him to shut up about it. Unfortunately some people who also work at the local bank overheard the discussion and it gave them horrible nightmares. Since they think the idea makes sense, they now believe that everyone will starve to death if they don’t rob banks and donate the money to charities that try to feed the world. The friend working at the bank now gets really upset and tells the dude with the idea about this. He argues that this shows how dangerous the idea is and that his colleagues, and everyone else who works for a bank and is told about the idea, might just rob their own banks.
An inconvenient detail here, that makes the idea slightly less likely, is that it isn’t talking about the local bank in your town, or any bank on Earth at all, but one located in the system of Epsilon Eridani. And the friend and people with nightmares are not working for some bank but a charity concerned with space colonization. To conclude that the idea is dangerous, they don’t just have to accept its overall premise but also that there are banks over at Epsilon Eridani, that it is worth it to rob them, that one can build the necessary spaceships to reach that place and so on. In other words you have to be completely nuts and should seek help if you seriously believe that the idea is dangerous.
And what is the secondary problem?
The problem is not the idea itself but that there obviously are people crazy enough to take it seriously and who might commit to crazy things due to their beliefs. The problem is that everyone is concerned with space colonization, but we don’t want to have space colonized by some freaks with pirate spaceships out to rob alien banks because of some crazy idea.
P.S. Any inconsistency in the above story is intended to resemble the real idea.
I am familiar with the forbidden idea, and don’t think this analogy resembles it at all.
This analogy makes sense if you assume the conclusion that the argument for the post being a Basilisk is incorrect, but not as an argument for convincing people that it’s incorrect. To evaluate whether the argument is correct, you have to study the argument itself; there is no royal road (the conclusion can be studied in other ways, since a particular proof can’t be demanded).
(See this summary of the structure of the argument.)
FWIW, loads of my comments were apparently deleted by administrators at the time.
I was away for a couple of months while the incident took place, and when I returned I actually used your user page to reconstruct most of the missing conversation (with blanks filled in from other user pages and an alternate source). Yours was particularly useful because of how prolific you were in quoting those to whom you were replying. I still have ten pages of your user comments stored on my hard drive somewhere. :)
Yes, some weren’t fully deleted, but—IIRC—others were. If I am remembering this right, the first deleted post (Roko’s) left comments behind in people’s profiles, but with the second deleted post the associated comments were rendered completely inaccessible to everyone. At the time, I figured that the management was getting better at nuking people’s posts.
After that—rather curiously—some of my subsequent “marked deleted” posts remained visible to me when logged in—so I wasn’t even aware of what had been “marked deleted” to everyone else for most of the time—unless I logged out of the site.
That could only apply to your original post, not subsequent stuff.
Right. Bottle. Genie.