Just noting that buried in the comments Will has stated that he thinks the probability that cryo will actually save your life is one in a million (10^-6), with some confusion surrounding the technicalities of how to actually assign that and deal with structural uncertainty.
I think that we need to iron out a consensus probability before this discussion continues.
Edit: especially since if this probability is correct, then the post no longer makes sense…
Correction: not ‘you’, me specifically. I’m young, physically and psychologically healthy, and rarely find myself in situations where my life is in danger (the most obvious danger is of course car accidents). It should also be noted that I think a singularity is a lot nearer than your average singularitarian does, and think the chance of me dying a non-accidental/non-gory death is really low.
I’m afraid that ‘this discussion’ is not the one I originally intended with this post: do you think it is best to have it here? I’m afraid that people are reading my post as taking a side (perhaps due to a poor title choice) when in fact it is making a comment about the unfortunate certainty people seem to consistently have on both sides of the issue. (Edit: Of course, this post does not present arguments for both sides, but simply attempts to balance the overall debate in a more fair direction.)
Should we nominate a victim to write a post summarizing various good points either for or against signing up for cryonics (not the feasibility of cryonics technologies!) while taking care to realize that preferences vary and various arguments have different weights dependent on subjective interpretations? I would love to nominate Steve Rayhawk because it seems right up his alley, but I’m afraid he wouldn’t like to be spotlighted. I would like to nominate Steven Kaas if he was willing. (Carl Shulman also comes to mind but I suspect he’s much too busy.)
(edit) I guess I don’t fully understand how the proposed post would differ from this one (doesn’t it already cover some of the “good points against” part?), and I’ve also always come down on the “no” side more than most people here.
I think I missed some decent points against (one of which is yours) and the ‘good arguments for’ do not seem to have been collected in a coherent fashion. If they were in the same post, written by the same person, then there’s less of a chance that two arguments addressing the same point would talk past each other. I think that you wouldn’t have to suggest a conclusion, and could leave it completely open to debate. I’m willing to bet most people will trust you to unbiasedly and effectively put forth the arguments for both sides. (I mean, what with that great quote about reconstruction from corpses and all.)
I don’t think so—the points in the post stand regardless of the probability Will assigns. Bringing up other beliefs of Will is an ad hominem argument. Ad hominem is a pretty good argument in the absence of other evidence, but we don’t need to go there today.
The point is simply that if people have widely varying estimates of how likely cryo is to work (0.000001 versus say 0.05 for Robin Hanson and say 0.1 for me), we should straighten those out before getting on to other stuff, like whether it is plausible to rationally reject it. It just seems silly to me that the debate goes on in spite of no effort to agree on this crucial parameter.
If Will’s probability is correct, then I fail to see how his post makes sense: it wouldn’t make sense for anyone to pay for cryo.
Once again, my probability estimate was for myself. There are important subjective considerations, such as age and definition of identity, and important sub-disagreements to be navigated, such as AI takeoff speed or likelihood of Friendliness. If I was 65 years old, and not 18 like I am, and cared a lot about a very specific me living far into the future, which I don’t, and believed that a singularity was in the distant future, instead of the near-mid future as I actually believe, then signing up for cryonics would look a lot more appealing, and might be the obviously rational decision to make.
Most people who are considering cryo here are within 10 years of your age. In particular, I am only 7 years older. 7 years doesn’t add up to moving from 0.000001 to 0.1, so one of us has a false belief.
What?! Roko, did you seriously not see the two points I had directly after the one about age? Especially the second one?! How is my lack of a strong preference to stay alive into the distant future a false belief? It’s a preference, not a belief.
I agree with you that not wanting to be alive in the distant future is a valid reason to not sign up for cryo, and I think that if that’s what you want, then you’re correct to not sign up.
Okay. Like I said, the one in a million thing is for myself. I think that most people, upon reflection (but not so much reflection as something like CEV requires), really would like to live far into the future, and thus should have probabilities much higher than 1 in a million.
How is the probability dependent upon whether you want to live into the future? Surely either you get revived or not? Or do you mean something different than I do by this probability? Do you mean something different than I do by the term “probability”?
We were talking about the probability of getting ‘saved’, and ‘saved’ to me requires that the future is suited such that I will upon reflection be thankful that I was revived instead of those resources being used for something else I would have liked to happen. In the vast majority of post-singularity worlds I do not think this will be the case. In fact, in the vast majority of post-singularity worlds, I think cryonics becomes plain irrelevant. And hence my sorta-extreme views on the subject.
I tried to make it clear in my post and when talking to both you and Vladimir Nesov that I prefer talking about ‘probability that I will get enough utility to justify cryonics upon reflection’ instead of ‘probability that cryonics will result in revival, independent of whether or not that will be considered a good thing upon reflection’. That’s why I put in the abnormally important footnote.
Oh, I see, my bad, apologies for the misunderstanding.
In which case, I ask: what is your probability that if you sign up for cryo now, you will be cryopreserved and revived (i.e. that your brain-state will be faithfully restored)? (This being something that you and I ought to agree on, and ought to be roughly the same replacing “Will” with “Roko”)
Cool, I’m glad to be talking about the same thing now! (I guess any sort of misunderstanding/argument causes me a decent amount of cognitive burden that I don’t realize is there until after it is removed. Maybe a fear of missing an important point that I will be embarrassed about having ignored upon reflection. I wonder if Steve Rayhawk experiences similar feelings on a regular basis?)
Well here’s a really simple, mostly qualitative analysis, with the hope that “Will” and “Roko” should be totally interchangeable.
Option 1: Will signs up for cryonics.
uFAI is developed before Will is cryopreserved. Signing up for cryonics doesn’t work, but this possibility has no significance in our decision theory anyway.
uFAI is developed after Will is cryopreserved. Signing up for cryonics doesn’t work, but this possibility has no significance in our decision theory anyway.
FAI is developed before Will is cryopreserved. Signing up for cryonics never gets a chance to work for Will specifically.
FAI is developed after Will is cryopreserved. Cryonics might work, depending on the implementation and results of things like CEV. This is a huge question mark for me. Something close to 50% is probably appropriate, but at times I have been known to say something closer to 5%, based on considerations like ‘An FAI is not going to waste resources reviving you: rather, it will spend resources on fulfilling what it expects your preferences probably were. If your preferences mandate you being alive, then it will do so, but I suspect that most humans upon much reflection and moral evolution won’t care as much about their specific existence.’ Anna Salamon and (I think) Eliezer suspect that personal identity is closer to human-ness than e.g. Steve Rayhawk and I do, for what it’s worth.
An existential risk occurs before Will is cryopreserved. Signing up for cryonics doesn’t work, but this possibility has no significance in our decision theory anyway.
An existential risk occurs after Will is cryopreserved. Signing up for cryonics doesn’t work, but this possibility has no significance in our decision theory anyway.
Option 2: Will does not sign up for cryonics.
uFAI is developed before Will dies. This situation is irrelevant to our decision theory.
uFAI is developed after Will dies. This situation is irrelevant to our decision theory.
FAI is developed before Will dies. This situation is irrelevant to our decision theory.
FAI is developed after Will dies. Because Will was not cryopreserved the FAI does not revive him in the typical sense. However, perhaps it can faithfully restore Will’s brain-state from recordings of Will in the minds of humanity anyway, if that’s what humanity would want. Alternatively Will is revived in ancestor simulations done by the FAI or any other FAI that is curious about humanity’s history around the time right before its singularity. Measure is really important here, so I’m confused. I suspect the probability is lower than, but not orders of magnitude lower than, the 50% figure above. This is an important point.
An existential risk occurs before Will dies. This possibility has no significance in our decision theory anyway.
An existential risk occurs after Will dies. This possibility has no significance in our decision theory anyway.
Basically, the point is that the most important factor by far is what an FAI does after going FOOM, and we don’t really know what’s going to happen there. So cryonics becomes a matter of preference more than a matter of probability. But if you’re thinking about worlds that our decision theory discounts, e.g. where a uFAI is developed or rogue MNT is developed, then the probability of being revived drops a lot.
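To make the branch structure explicit, here is a minimal sketch of the breakdown above (the 0.5 and 0.4 revival figures are placeholders of my own, not estimates anyone has argued for):

```python
# Toy sketch: in uFAI and existential-risk worlds the choice changes nothing,
# so only the "FAI after cryopreservation/death" branch can move the decision.

p_alive_after_singularity = {
    # scenario: (P(alive | signed up for cryonics), P(alive | did not sign up))
    "uFAI, any timing":                 (0.0, 0.0),
    "other existential risk":           (0.0, 0.0),
    "FAI while Will is still alive":    (1.0, 1.0),  # cryonics never comes into play
    "FAI after cryopreservation/death": (0.5, 0.4),  # placeholder revival probabilities
}

# Only branches where the two options come apart can affect the decision.
decision_relevant = {s: p for s, p in p_alive_after_singularity.items() if p[0] != p[1]}
print(decision_relevant)  # -> {'FAI after cryopreservation/death': (0.5, 0.4)}
```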
You could still actually give a probability that you’ll get revived. Yes, I agree that knowing what the outcome of AGI is is extremely important, but you should still just have a probability for that.
Well, that gets tricky, because I have weak subjective evidence that I can’t share with anyone else, and really odd ideas about it, that make me think that an FAI is the likely outcome. (Basically, I suspect something sorta kinda a little along the lines of me living in a fun theory universe. Or more precisely, I am a sub-computation of a longer computation that is optimized for fun, so that even though my life is sub-optimal at the moment I expect it to get a lot better in the future, and that the average of the whole computation’s fun will turn out to be argmaxed. And my life right now rocks pretty hard anyway. I suspect other people have weaker versions of this [with different evidence from mine] with correspondingly weaker probability estimates for this kind of thing happening.) So if we assume with p=1 that a positive singularity will occur for the sake of ease, that leaves about 2% that cryonics will work if you die (5% that an FAI raises the cryonic dead minus 3% that an FAI raises all the dead), times the probability that you die before the singularity (about 15% for most people [but about 2% for me]), which leads to 0.3% as my figure for someone with a sense of identity far stronger than me, Kaj, and many others, who would adjust downward from there (an FAI can be expected to extrapolate our minds and discover it should use the resources on making 10 people with values similar to ourselves instead, or something). If you say something like 5% positive singularity instead, then it comes out to 0.015%, or very roughly 1 in 7000 (although of course your decision theory should discount worlds in which you die no matter what anyway, so the probability of actually living past the singularity shouldn’t change your decision to sign up all that much). I suspect someone with different intuitions would give a very different answer, but it’ll be hard to make headway in debate because it really is so non-technical. The reason I give extremely low probabilities for myself is due to considerations that apply to me only and that I’d rather not go into.
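Spelling out that arithmetic (every input is just one of the rough subjective figures quoted above, nothing more):

```python
# Back-of-the-envelope version of the estimate in the comment above.

p_fai_revives_cryo_patients = 0.05   # FAI raises the cryonically preserved dead
p_fai_revives_everyone      = 0.03   # FAI raises all the dead, cryopreserved or not
p_die_before_singularity    = 0.15   # "about 15% for most people"
p_positive_singularity      = 0.05   # the pessimistic 5% case

p_cryo_margin   = p_fai_revives_cryo_patients - p_fai_revives_everyone  # ~0.02
p_given_fai     = p_cryo_margin * p_die_before_singularity              # ~0.003  -> "0.3%"
p_unconditional = p_given_fai * p_positive_singularity                  # ~0.00015 -> "0.015%"

print(p_given_fai, p_unconditional, round(1 / p_unconditional))         # ~1 in 6700, i.e. "roughly 1 in 7000"
```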
The ideas about fun theory are crazy talk indeed, but they’re sort of tangential to my main points. I have much crazier ideas peppered throughout the comments of this post (very silly implications of decision theory in a level 4 multiverse that are almost assuredly wrong but interesting intuition pumps) and even crazier ideas in the notes I write to myself. Are you worried that this will lead to some sort of mental health danger, or what? I don’t know how often high shock levels damage one’s sanity to an appreciable degree.
It’s not “shock levels” which are a problem, it’s working in the “almost assuredly wrong” mode. If you yourself believe ideas you develop to be wrong, are they knowledge, are they progress? Do crackpots have “damaged sanity”?
It’s usually better to develop ideas on as firm ground as possible, working towards the unknown from statements you can rely on. Even in this mode you will often fail, but you’d be able to make gradual progress that won’t be illusory. Not all questions are ready to be answered (or even asked).
For what it’s worth, the Uncertain Future application gives me a 99% chance of a singularity before 2070, if I recall correctly. The mean of my distribution is 2028.
I really wish more SIAI members talked to each other about this! Estimates vary wildly, and I’m never sure if people are giving estimates taking into account their decision theory or not (that is, thinking ‘We couldn’t prevent a negative singularity if it was to occur in the next 10 years, so let’s discount those worlds and exclude them from our probability estimates’). I’m also not sure if people are giving far-off estimates because they don’t want to think about the implications otherwise, because they tried to build an FAI and it didn’t work, because they want to signal sophistication (and sophisticated people don’t predict crazy things happening very soon), because they are taking an outside view of the problem, or because they’ve read the recent publications at the AGI conferences and various journals, thought about the advances that need to be made, estimated the rate of progress, and determined a date using the inside view (like Steve Rayhawk, who gives a shorter time estimate than anyone else; or Shane Legg, who I’ve heard also gives a short estimate, but I am not sure about that; or Ben Goertzel, who I am again not entirely sure about; or Juergen Schmidhuber, who seems to be predicting it soonish; or Eliezer, who used to have a soonish estimate with very wide tails, but I have no idea what his thoughts are now). I’ve heard the guys at FHI also have distant estimates, and a lot of narrow AI people predict far-off AGI as well. Where are the ‘singularity is far’ people getting their predictions?
The problem with the Uncertain Future is that it is a model of reality which allows you to play with the parameters of the model, but not the structure. For example, it has no option for “model uncertainty”, e.g. the possibility that the assumptions it makes about the forms of probability distributions are incorrect. And a lot of these assumptions were made for the sake of tractability rather than realism. I think that the best way to use it is as an intuition pump for your own model, which you could make in Excel or in your head.
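To make the model-uncertainty point concrete, here is a toy sketch (the numbers are invented for illustration, and this is not how the Uncertain Future application actually computes anything):

```python
# Toy illustration of how explicit model uncertainty drags a 99% output
# back toward the middle.

p_model_structure_correct = 0.7    # hypothetical credence that the model's assumptions hold
p_event_given_model       = 0.99   # what the model outputs, e.g. singularity before 2070
p_event_given_model_wrong = 0.5    # fallback prior if the assumptions are wrong

p_event = (p_model_structure_correct * p_event_given_model
           + (1 - p_model_structure_correct) * p_event_given_model_wrong)
print(p_event)   # ~0.84 -- well short of 99% once structural uncertainty is priced in
```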
Giving probabilities of 99% is a classic symptom of not having any model uncertainty.
If Nick and I write some more posts I think this would be the theme. Structural uncertainty is hard to think around.
Anyway, I got my singularity estimates by listening to lots of people working at SIAI and seeing whose points I found compelling. When I arrived at Benton I was thinking something like 2055. It’s a little unsettling that the more arguments I hear from both sides, the nearer in the future my predictions get. I think my estimates are probably too biased towards Steve Rayhawk’s, but this is because everyone else’s estimates seem to take the form of outside view considerations that I find weak.
(5% that an FAI raises the cryonic dead minus 3% that an FAI raises all the dead)
This seems to rely on your idea that, on reflection, humans probably don’t care about themselves, i.e. if I reflected sufficiently hard, I would place zero terminal value on my own life.
I wonder how you’re so confident about this? Like, 95% confident that all humans would place zero terminal value on their own lives?
Note also that it is possible that some but not all people would, on reflection, place zero value on their own lives.
if I reflected sufficiently hard, I would place zero terminal value on my own life.
Not even close to zero, but less terminal value than you would assign to other things that an FAI could optimize for. I’m not sure how much extrapolated unity of mankind there would be in this regard. I suspect Eliezer or Anna would counter my 5% with a 95%, and I would Aumann to some extent, but I was giving my impression and not my belief. (I think that this is better practice at the start of a ‘debate’: otherwise you might update on the wrong expected evidence. EDIT: To be more clear, I wouldn’t want to update on Eliezer’s evidence if it was some sort of generalization from fictional evidence from Brennan’s world or something, but I would want to update if he had a strong argument that identity has proven to be extremely important to all of human affairs since the dawn of civilization, which is entirely plausible.)
It seems odd to me that out of the 10^40 atoms in the solar system, there would not be any left to revive cryo patients. My impression is that FAI would revive cryo patients, with probability 80%, the remaining 20% being for very odd scenarios that I just can’t think of.
I guess I’m saying that the atoms it takes to revive a cryo patient would be put to vastly better use as computronium. You’re trading off one life for a huge amount of potential lives. A few people, like Alicorn if I understand her correctly, think that people who are already alive are worth a huge number of potential lives, but I don’t quite understand that intuition. Is this a point of disagreement for us?
Yeah, but the cryo patient could be run in software rather than in hardware, which would mean that it would be a rather insignificant amount of extra effort.
Gah, sorry, I keep leaving things out. I’m thinking about the actual physical work of finding out where cryo patients are, scanning their brains, repairing the damage, and then running them. Mike Blume had a good argument against this point: proportionally, the startup cost of scanning a brain is not much at all compared to the infinity of years of actually running the computation. This is where I should be doing the math… so I’m going to think about it more and try and figure things out. Another point is that an AGI could gain access to infinite computing power in finite time, during which it could do everything, but I think I’m just confused about the nature of computations in a Tegmark multiverse here.
actual physical work of finding out where cryo patients are, scanning their brains, repairing the damage, and then running them.
I hadn’t thought of that; certainly if the AI’s mission was to run as many experience-moments as possible in the amount of space-time-energy it had, then it wouldn’t revive cryo patients.
Note that the same argument says that it would kill all existing persons rather than upload them, and re-use their mass and energy to run ems of generic happy people (maximizing experience moments without regard to any deontological constraints has some weird implications...)
Yes, but this makes people flustered so I prefer not to bring it up as a possibility. I’m not sure if it was Bostrom or just generic SIAI thinking where I heard that an FAI might deconstruct us in order to go out into the universe, solve the problem of astronomical waste, and then run computations of us (or in this case generic transhumans) far in the future.
Of course at this point, the terminology “Friendly” becomes misleading, and we should talk about a Goal-X-controlled-AGI, where Goal X is a variable for the goal that that AGI would optimize for.
There is no unique value for X. Some have suggested the output of CEV as the goal system, but if you look at CEV in detail, you see that it is jam-packed with parameters, all of which make a difference to the actual output.
I would personally lobby against the idea of an AGI that did crazy shit like killing existing people to save a few nanoseconds.
Hm, I’ve noticed before that the term ‘Friendly’ is sort of vague. What would I call an AI that optimizes strictly for my goals (and if I care about others’ goals, so be it)? A Will-AI? I’ve said a few times ‘your Friendly is not my Friendly’ but I think I was just redefining Friendliness in an incorrect way that Eliezer wouldn’t endorse.
What would I call an AI that optimizes strictly for my goals...A Will-AI?
One could say “Friendly towards Will.”
But the problem of nailing down your goals seems to me much harder than the problem of negotiating goals between different people. Thus I don’t see a problem of being vague about the target of Friendliness.
Agreed. And asking the question of what is preference of a specific person, represented in some formal language, seems to be a natural simplification of the problem statement, something that needs to be understood before the problem of preference aggregation can be approached.
but I think I was just redefining Friendliness in an incorrect way that Eliezer wouldn’t endorse.
Beware of the urge to censor thoughts that disagree with authority. I personally agree that there is a serious issue here—the issue of moral antirealism, which implies that there is no “canonical human notion of goodness”, so the terminology “Friendly AI” is actually somewhat misleading, and it might be better to say “average human extrapolated morality AGI” when that’s what we want to talk about, e.g.
“an average human extrapolated morality AGI would oppose a paperclip maximizer”.
Then it sounds less onerous to say that you disagree with what an average human extrapolated morality AGI would do than that you disagree with what a “Friendly AI” would do, because most people on this forum disagree with averaged-out human morality (for example, the average human is a theist). Contrast:
“What, you disagree with the FAI? Are you a bad guy then?”
“Friendly AI” is about as specific/ambiguous as “morality”—something humans mostly have in common, allowing for normal variation, not referring to details about specific people. As with preference (morality) of specific people, we can speak of FAI optimizing the world to preference of specific people. Naturally, for each given person it’s preferable to launch a personal-FAI to a consensus-FAI.
However, perhaps it can faithfully restore Will’s brain-state from recordings of Will in the minds of humanity anyway, if that’s what humanity would want. Alternatively Will is revived in ancestor simulations done by the FAI or any other FAI that is curious about humanity’s history around the time right before its singularity.
I am reasonably confident that no such process can produce an entity that I would identify as myself. Being reconstructed from other peoples’ memories means losing the memories of all inner thoughts, all times spent alone, and all times spent with people who have died or forgotten the occasion. That’s too much lost for any sort of continuity of consciousness.
Hm, well we can debate the magic powers a superintelligence possesses (whether or not it can raise the dead), but I think this would make Eliezer sad. I for one am not reasonably confident either way. I am not willing to put bounds on an entity that I am not sure won’t get access to an infinite amount of computation in finite time. At any rate, it seems we have different boundaries around identity. I’m having trouble removing the confusion about identity from my calculations.
There are important subjective considerations, such as age and definition of identity,
Nope, “definition of identity” doesn’t influence what actually happens as a result of your decision, and thus doesn’t influence how good what happens will be.
You are not really trying to figure out “How likely is it that you survive as a result of signing up?”, that’s just an instrumental question that is supposed to be helpful; you are trying to figure out which decision you should make.
Nope, “definition of identity” doesn’t influence what actually happens as a result of your decision, and thus doesn’t influence how good what happens will be.
Simply wrong. I can assign positive utility to whatever interpretation of an event I please. If the map changes, the utility changes, even if the territory stays the same. Preferences are not in the territory. Did I misunderstand you?
EDIT: Ah, I think I know what happened: Roko and I were talking about the probability of me being ‘saved’ by cryonics in the thread he linked to, but perhaps you missed that. Let me copy/paste something I said from this thread: “I tried to make it clear in my post and when talking to both you and Vladimir Nesov that I prefer talking about ‘probability that I will get enough utility to justify cryonics upon reflection’ instead of ‘probability that cryonics will result in revival, independent of whether or not that will be considered a good thing upon reflection’. That’s why I put in the abnormally important footnote.” I don’t think I emphasized this enough. My apologies. (I feel silly, because without this distinction you’ve probably been thinking I’ve been committing the mind projection fallacy this whole time, and I didn’t notice.)
You are not really trying to figure out “How likely is it that you survive as a result of signing up?”, that’s just an instrumental question that is supposed to be helpful; you are trying to figure out which decision you should make.
Not sure I’m parsing this right. Yes, I am determining what decision I should make. The instrumental question is a part of that, but it is not the only consideration.
I can assign positive utility to whatever interpretation of an event I please. If the map changes, the utility changes, even if the territory stays the same. Preferences are not in the territory. Did I misunderstand you?
You haven’t misunderstood me, but you need to pay attention to this question, because it’s more or less a consensus on Less Wrong that your position expressed in the above quote is wrong. You should maybe ask around for clarification of this point, if you don’t get a change of mind from discussion with me.
You may try the metaethics sequence, and also/in particular these posts:
That preference is computed in the mind doesn’t make it any less a part of the territory than anything else. This is just a piece of territory that happens to be currently located in human minds. (Well, not quite, but to a first approximation.)
Your map may easily change even if the territory stays the same. This changes your belief, but this change doesn’t influence what’s true about the territory. Likewise, your estimate of how good situation X is may change, once you process new arguments or change your understanding of the situation, for example by observing new data, but that change of your belief doesn’t influence how good X actually is. Morality is not a matter of interpretation.
Before I spend a lot of effort trying to figure out where I went wrong (which I’m completely willing to do, because I read all of those posts and the metaethics sequence and figured I understood them), can you confirm that you read my EDIT above, and that the misunderstanding addressed there does not encompass the problem?
Now I have read the edit, but it doesn’t seem to address the problem. Also, I don’t see what you can use the concepts you bring up for, like “probability that I will get enough utility to justify cryonics upon reflection”. If you expect to believe something, you should just believe it right away. See Conservation of expected evidence. But then, “probability this decision is right” is not something you can use for making the decision, not directly.
Also, I don’t see what you can use the concepts you bring up for, like “probability that I will get enough utility to justify cryonics upon reflection”.
This might not be the most useful concept, true, but the issue at hand is the meta-level one of people’s possible overconfidence about it.
“Probability of signing up being good”, especially obfuscated with “justified upon infinite reflection”, being subtly similar to “probability of the decision to sign up being correct”, is too much of a ruse to use without very careful elaboration. A decision can be absolutely, 99.999999% correct, while the probability of it being good remains at 1%, both known to the decider.
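A toy illustration of that distinction, with made-up numbers:

```python
# Illustrative only: a decision can be correct (positive expected utility,
# knowable in advance) while the probability of the outcome being good stays low.

p_good      = 0.01     # chance the gamble pays off
utility_win = 1000.0   # utility if it does
cost        = 1.0      # utility given up either way

expected_gain = p_good * utility_win - cost   # 9.0 > 0: taking the bet is the right decision
print(expected_gain, p_good)                  # ...even though it is good only 1% of the time
```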
So you read footnote 2 of the post and do not think it is a relevant and necessary distinction? And you read Steven’s comment in the other thread where it seems he dissolved our disagreement and determined we were talking about different things?
I know about the conservation of expected evidence. I understand and have demonstrated understanding of the content in the various links you’ve given me. I really doubt I’ve been making the obvious errors you accuse me of for the many months I’ve been conversing with people at SIAI (and at Less Wrong meetups and at the decision theory workshop) without anyone noticing.
Here’s a basic summary of what you seem to think I’m confused about: There is a broad concept of identity in my head. Given this concept of identity I do not want to sign up for cryonics. If this concept of identity changed such that the set of computations I identified with became smaller, then cryonics would become more appealing. I am talking about the probability of expected utility, not the probability of an event. The first is in the map (even if the map is in the territory, which I realize, of course), the second is in the territory.
EDIT: I am treating considerations about identity as a preference: whether or not I should identify with any set of computations is my choice, but subject to change. I think that might be where we disagree: you think everybody will eventually agree what identity is, and that it will be considered a fact about which we can assign different probabilities, but not something subjectively determined.
That preference is yours and yours alone, without any community to share it, doesn’t make its content any less of a fact than if you’d had a whole humanity of identical people to back it up. (This identity/probability discussion is tangential to a more focused question of correctness of choice.)
The easiest step is for you to look over the last two paragraphs of this comment and see if you agree with that. (Agree/disagree in what sense, if you suspect essential interpretational ambiguity.)
I don’t know why you brought up the concept of identity (or indeed cryonics) in the above, it wasn’t part of this particular discussion.
At first glance and 15 seconds of thinking, I agree, but: “but that change of your belief doesn’t influence how good X actually is” is to me more like “but that change of your belief doesn’t influence how good X will be considered upon an infinite amount of infinitely good reflection”.
Now try to figure out what the question “What color is the sky, actually?” means, when compared with “How good is X, actually?” and your interpretation “How good will X seem after an infinite amount of infinitely good reflection?”. The “infinitely good reflection” thing is a surrogate for the fact itself, no less in the first case, and no more in the second.
If you essentially agree that there is fact of the matter about whether a given decision is the right one, what did you mean by the following?
I can assign positive utility to whatever interpretation of an event I please. If the map changes, the utility changes, even if the territory stays the same. Preferences are not in the territory.
You can’t “assign utility as you please”; this is not a matter of choice. The decision is either correct or it isn’t, and you can’t make it correct or incorrect by willing so. You may only work on figuring out which way it is, like with any other fact.
Edit: adding a sentence in bold that is really important but that I failed to notice the first time. (Nick Tarleton alerted me to an error in this comment that I needed to fix.)
Any intelligent agent will discover that the sky is blue. Not every intelligent agent will think that the blue sky is equally beautiful. Me, I like grey skies and rainy days. If I discover that I actually like blue skies at a later point, then that changes the perceived utility of seeing a grey sky relative to a blue one. The simple change in preference also changes my expected utility. Yes, maybe the new utility was the ‘correct’ utility all along, but how is that an argument against anything I’ve said in my posts or comments? I get the impression you consistently take the territory view where I take the map view, and I further think that the map view is way more useful for agents like me that aren’t infinitely intelligent nor infinitely reflective. (Nick Tarleton disagrees about taking the map view and I am now reconsidering. He raises the important point that taking the territory view doesn’t mean throwing out the map, and gives the map something to be about. I think he’s probably right.)
You may only work on figuring out which way it is, like with any other fact.
And the way one does this is by becoming good at luminosity and discovering what one’s terminal values are. Yeah, maybe it turns out sufficiently intelligent agents all end up valuing the exact same thing, and FAI turns out to be really easy, but I do not buy it as an assertion.
This reads to me like
To figure out the weight of a person, we need to develop experimental procedures, make observations, and so on. Yes, maybe it turns out that “weight of a person” is a universal constant and that all experimenters will agree that it’s exactly 80 kg in all cases, and weighing people will thus turn out really easy, but I don’t buy this assertion.
See the error? That there are moral facts doesn’t imply that everyone’s preference is identical, that “all intelligent agents” will value the same thing. Every sane agent should agree on what is moral, but not every sane agent is moved by what is moral, some may be moved by what is prime or something, while agreeing with you that what is prime is often not moral. (See also this comment.)
I’m a little confused about your “weight of a person” example because ‘a’ is ambiguous in English. Did you mean one specific person, or the weighing of different people?
Every sane agent should agree on what is moral
What if CEV doesn’t exist, and there really are different groups of humans with different values? Is one set of values “moral” and the other “that other human thing that’s analogous to morality but isn’t morality”? Primeness is so different from morality that it’s clear we’re talking about two different things. But say we take what you’re calling morality and modify it very slightly, only to the point where many humans still hold to the modified view. It’s not clear to me that the agents will say “I’m moved by this modified view, not morality”. Why wouldn’t they say “No, this modification is the correct morality, and I am moved by morality!”
I have read the metaethics sequence but don’t claim to fully understand it, so feel free to point me to a particular part of it.
What if CEV doesn’t exist, and there really are different groups of humans with different values?
Of course different people have different values. These values might be similar, but they won’t be identical.
Primeness is so different from morality that it’s clear we’re talking about two different things.
Yes, but what is “prime number”? Is it 5, or is it 7? 5 is clearly different from 7, although it’s very similar to it in that it’s also prime. Use the analogy of prime=moral and 5=Blueberry’s values, 7=Will’s values.
It’s not clear to me that the agents will say “I’m moved by this modified view, not morality”. Why wouldn’t they say “No, this modification is the correct morality, and I am moved by morality!”
Because that would be pointless disputing of definitions—clearly, different things are meant by the word “morality” in your example.
Yes, but what is “prime number”? Is it 5, or is it 7? 5 is clearly different from 7, although it’s very similar to it in that it’s also prime. Use the analogy of prime=moral and 5=Blueberry’s values, 7=Will’s values.
I see your point, but there is an obvious problem with this analogy: prime and nonprime are two discrete categories. But we can consider a continuum of values, ranging from something almost everyone agrees is moral, through values that are unusual or uncommon but still recognized as human values, all the way to completely alien values like paperclipping.
My concern is that it’s not clear where in the continuum the values stop being “moral” values, unlike with prime numbers.
It might be unclear where the line lies, but it shouldn’t make the concept itself “fuzzy”, merely not understood. What we talk about when we refer to a certain idea is always something specific, but it’s not always clear what is implied by what we talk about. That different people can interpret the same words as referring to different ideas doesn’t make any of these different ideas undefined. The failure to interpret the words in the same way is a failure of communication, not a characterization of the idea that failed to be communicated.
I of course agree that “morality” admits a lot of similar interpretations, but I’d venture to say that “Blueberry’s preference” does as well. It’s an unsolved problem—a core question of Friendly AI—to formally define any of the concepts interpreting these words in a satisfactory way. The fuzziness in communication and elusiveness in formal understanding are relevant equally for the aggregate morality and personal preference, and so the individual/aggregate divide is not the point that particularly opposes the analogy.
Do you think there is a clear line between what humans in general value (morality) and what other entities might value, and we just don’t know where it is? Let’s call the other side of the line ‘schmorality’. So a paperclipper’s values are schmoral.
Is it possible that a human could have values on the other side of the line (schmoral values)?
Suppose another entity, who is on the other side of the line, has a conversation with a human about a moral issue. Both entities engage in the same kind of reasoning, use the same kind of arguments and examples, so why is one reasoning called “moral reasoning” and the other just about values (schmoral reasoning)?
Suppose I am right on the edge of the line. So my values are moral values, but a slight change makes these values schmoral values. From my point of view, these two sets of values are very close. Why do you give them completely different categories? And suppose my values change slightly over time, so I cross the line and back within a day. Do I suddenly stop caring about morality, then start again? This discontinuity seems very strange to me.
I don’t say that any given concept is reasonable for all purposes, just that any concept has a very specific intended meaning, at the moment it’s considered. The concept of morality can be characterized as, roughly, referring to human-like preference, or aggregate preference of humanity-like collections of individual preferences—this is a characterization resilient to some measure of ambiguity in interpretation. The concepts themselves can’t be negotiated, they are set in stone by their intended meaning, though a different concept may be better for a given purpose.
I don’t say that any given concept is reasonable for all purposes, just that any concept has a very specific intended meaning, at the moment it’s considered. The concept of morality can be characterized as, roughly, referring to human-like preference
If you essentially agree that there is fact of the matter about whether a given decision is the right one, what did you mean by the following?
I can assign positive utility to whatever interpretation of an event I please. If the map changes, the utility changes, even if the territory stays the same. Preferences are not in the territory.
In this exchange
If Will’s probability is correct, then I fail to see how his post makes sense: it wouldn’t make sense for anyone to pay for cryo.
There are important subjective considerations, such as age and definition of identity,
Nope, “definition of identity” doesn’t influence what actually happens as a result of your decision, and thus doesn’t influence how good what happens will be.
Will, by “definition of identity”, meant a part of preference, making the point that people might have varying preferences (this being the sense in which preference is “subjective”) that make cryonics a good idea for some but not others. He read your response as a statement of something like moral realism/externalism; he intended his response to address this, though it was phrased confusingly.
That would be a potentially defensible view (What are the causes of variation? How do we know it’s there?), but I’m not sure it’s Will’s (and using the word “definition” in this sense goes very much against the definition of “definition”).
If Will’s probability is correct, then I fail to see how his post makes sense: it wouldn’t make sense for anyone to pay for cryo.
Similar to what I think JoshuaZ was getting at, signing up for cryonics is a decently cheap signal of your rationality and willingness to take weird ideas seriously, and it’s especially cheap for young people like me who might never take advantage of the ‘real’ use of cryonics.
The expected utility argument doesn’t hold because you can use the money to give yourself more than a one-in-a-million boost to your chance of survival to the singularity, for example by buying 9000 lottery tickets and funding SIAI if you win.
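As a rough illustration with hypothetical ticket odds (this only compares the chance of winning anything against the one-in-a-million figure; whether a funded SIAI then converts a win into survival is a further step):

```python
# Hypothetical numbers only, to illustrate the comparison being made.

p_cryo_saves_me = 1e-6          # the one-in-a-million figure under discussion
p_jackpot       = 1 / 175e6     # made-up odds for a single ticket
tickets         = 9000

p_win_at_least_once = 1 - (1 - p_jackpot) ** tickets   # ~5.1e-5
print(p_win_at_least_once > p_cryo_saves_me)           # True under these assumptions
```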
That really depends a lot on the expected utility. Moreover, argument 2 above (getting people to think about long-term prospects) has little connection to the value of p.
In biology, individual self-preservation is an emergent subsidiary goal—what is really important is genetic self-preservation.
Organisms face a constant trade-off—whether to use resources now to reproduce, or whether to invest them in self-perpetuation—in the hope of finding a better chance to reproduce in the future.
Calorie restriction and cryonics are examples of this second option—sacrificing current potential for the sake of possible future gains.
Organisms face a constant trade-off—whether to use resources now to reproduce, or whether to invest them in self-perpetuation—in the hope of finding a better chance to reproduce in the future.
Evolution faces this trade-off. Individual organisms are just stuck with trade-offs already made, and (if they happen to be endowed with explicit motivations) may be motivated by something quite other than “a better chance to reproduce in the future”.
Organisms choose—e.g. they choose whether to do calorie restriction—which diverts resources from reproductive programs to maintenance ones. They choose whether to divert resources in the direction of cryonics companies as well.
I’m not disputing that organisms choose. I’m disputing that organisms necessarily have reproductive programs. (You can only face a trade-off between two goals if you value both goals to start with.) Some organisms may value self-preservation, and value reproduction not at all (or only insofar as they view it as a form of self-preservation).
Just noting that buried in the comments Will has stated that he thinks the probability that cryo will actually save your life is one in a million -- 10^-6 -- (with some confusion surrounding the technicalities of how to actually assign that and deal with structural uncertainty).
I think that we need to iron out a consensus probability before this discussion continues.
Edit: especially since if this probability is correct, then the post no longer makes sense…
Correction: not ‘you’, me specifically. I’m young, phyisically and psychologically healthy, and rarely find myself in situations where my life is in danger (the most obvious danger is of course car accidents). It should also be noted that I think a singularity is a lot nearer than your average singularitarian, and think the chance of me dying a non-accidental/non-gory death is really low.
I’m afraid that ‘this discussion’ is not the one I originally intended with this post: do you think it is best to have it here? I’m afraid that people are reading my post as taking a side (perhaps due to a poor title choice) when in fact it is making a comment about the unfortunate certainty people seem to consistently have on both sides of the issue. (Edit: Of course, this post does not present arguments for both sides, but simply attempts to balance the overall debate in a more fair direction.)
Indeed, perhaps not the best place to discuss. But it is worth thinking about this as it does make a difference to the point at issue.
Should we nominate a victim to write a post summarizing various good points either for or against signing up for cryonics (not the feasibility of cryonics technologies!) while taking care to realize that preferences vary and various arguments have different weights dependent on subjective interpretations? I would love to nominate Steve Rayhawk because it seems right up his ally but I’m afraid he wouldn’t like to be spotlighted. I would like to nominate Steven Kaas if he was willing. (Carl Shulman also comes to mind but I suspect he’s much too busy.)
(edit) I guess I don’t fully understand how the proposed post would differ from this one (doesn’t it already cover some of the “good points against” part?), and I’ve also always come down on the “no” side more than most people here.
I think I missed some decent points against (one of which is yours) and the ‘good arguments for’ do not seem to have been collected in a coherent fashion. If they were in the same post, written by the same person, then there’s less of a chance that two arguments addressing the same point would talk past each other. I think that you wouldn’t have to suggest a conclusion, and could leave it completely open to debate. I’m willing to bet most people will trust you to unbiasedly and effectively put forth the arguments for both sides. (I mean, what with that great quote about reconstruction from corpses and all.)
I don’t think so—the points in the post stand regardless of the probability Will assigns. Bringing up other beliefs of Will is an ad hominem argument. Ad hominem is a pretty good argument in the absence of other evidence, but we don’t need to go there today.
It wasn’t intended as an ad-hom argument.
The point is simply that if people have widely varying estimates of how likely cryo is to work (0.000001 versus say 0.05 for Robin Hanson and say 0.1 for me), we should straighten those out before getting on to other stuff, like whether it is plausible to rationally reject it. It just seems silly to me that the debate goes on in spite of no effort to agree on this crucial parameter.
If Will’s probability is correct, then I fail to see how his post makes sense: it wouldn’t make sense for anyone to pay for cryo.
Once again, my probability estimate was for myself. There are important subjective considerations, such as age and definition of identity, and important sub-disagreements to be navigated, such as AI takeoff speed or likelihood of Friendliness. If I was 65 years old, and not 18 like I am, and cared a lot about a very specific me living far into the future, which I don’t, and believed that a singularity was in the distant future, instead of the near-mid future as I actually believe, then signing up for cryonics would look a lot more appealing, and might be the obviously rational decision to make.
Most people who are considering cryo here are within 10 years of your age. In particular, I am only 7 years older. 7 years doesn’t add up to moving from 0.0000001 to 0.1, so one of us has a false belief.
What?! Roko, did you seriously not see the two points I had directly after the one about age? Especially the second one?! How is my lack of a strong preference to stay alive into the distant future a false preference? Because it’s not a false belief.
I agree with you that not wanting to be alive in the distant future is a valid reason to not sign up for cryo, and I think that if that’s what you want, then you’re correct to not sign up.
Okay. Like I said, the one in a million thing is for myself. I think that most people, upon reflection (but not so much reflection as something like CEV requires), really would like to live far into the future, and thus should have probabilities much higher than 1 in a million.
How is the probability dependent upon whether you want to live into the future? Surely either you get revived or not? Or do you mean something different than I do by this probability? Do you mean something different than I do by the term “probability”?
We were talking about the probability of getting ‘saved’, and ‘saved’ to me requires that the future is suited such that I will upon reflection be thankful that I was revived instead of those resources being used for something else I would have liked to happen. In the vast majority of post-singularity worlds I do not think this will be the case. In fact, in the vast majority of post-singularity worlds, I think cryonics becomes plain irrelevant. And hence my sorta-extreme views on the subject.
I tried to make it clear in my post and when talking to both you and Vladimir Nesov that I prefer talking about ‘probability that I will get enough utility to justify cryonics upon reflection’ instead of ‘probability that cryonics will result in revival, independent of whether or not that will be considered a good thing upon reflection’. That’s why I put in the abnormally important footnote.
Oh, I see, my bad, apologies for the misunderstanding.
In which case, I ask: what is your probability that if you sign up for cryo now, you will be cryopreserved and revived (i.e. that your brain-state will be faithfully restored)? (This being something that you and I ought to agree on, and ought to be roughly the same replacing “Will” with “Roko”)
Cool, I’m glad to be talking about the same thing now! (I guess any sort of misunderstanding/argument causes me a decent amount of cognitive burden that I don’t realize was there until after it is removed. Maybe a fear of missing an important point that I will be embarrassed about having ignored upon reflection. I wonder if Steve Rayhawk experiences similar feelings on a normal basis?)
Well here’s a really simple, mostly qualitative analysis, with the hope that “Will” and “Roko” should be totally interchangeable.
Option 1: Will signs up for cryonics.
uFAI is developed before Will is cyopreserved. Signing up for cryonics doesn’t work, but this possibility has no significantness in our decision theory anyway.
uFAI is developed after Will is cryopreserved. Signing up for cryonics doesn’t work, but this possibility has no significantness in our decision theory anyway.
FAI is developed before Will is cryopreserved. Signing up for cryonics never gets a chance to work for Will specifically.
FAI is developed after Will is cryopreserved. Cryonics might work, depending on the implementation and results of things like CEV. This is a huge question mark for me. Something close to 50% is probably appropriate, but at times I have been known to say something closer to 5%, based on considerations like ‘An FAI is not going to waste resources reviving you: rather, it will spend resources on fulfilling what it expects your preferences probably were. If your preferences mandate you being alive, then it will do so, but I suspect that most humans upon much reflection and moral evolution won’t care as much about their specific existence.’ Anna Salamon and I think Eliezer suspect that personal identity is closer to human-ness than e.g. Steve Rayhawk and I do, for what it’s worth.
An existential risk occurs before Will is cryopreserved. Signing up for cryonics doesn’t work, but this possibility has no significantness in our decision theory anyway.
An existential risk occurs after Will is cryopreserved. Signing up for cryonics doesn’t work, but this possibility has no significantness in our decision theory anyway.
Option 2: Will does not sign up for cryonics.
uFAI is developed before Will dies. This situation is irrelevant to our decision theory.
uFAI is developed after Will dies. This situation is irrelevant to our decision theory.
FAI is developed before Will dies. This situation is irrelevant to our decision theory.
FAI is developed after Will dies. Because Will was not cryopreserved the FAI does not revive him in the typical sense. However, perhaps it can faithfully restore Will’s brain-state from recordings of Will in the minds of humanity anyway, if that’s what humanity would want. Alternatively Will is revived in ancestor simulations done by the FAI or any other FAI that is curious about humanity’s history around the time right before its singularity. Measure is really important here, so I’m confused. I suspect less but not orders of magnitude less than the 50% figure above? This is an important point.
An existential risk occurs and Will dies. This possibility has no significantness in our decision theory anyway.
An existential risk occurs and Will dies. This possibility has no significantness in our decision theory anyway.
Basically, the point is that the most important factor by far is what an FAI does after going FOOM, and we don’t really know what’s going to happen there. So cryonics becomes a matter of preference more than a matter of probability. But if you’re thinking about worlds that our decision theory discounts, e.g. where a uFAI is developed or rogue MNT is developed, then the probability of being revived drops a lot.
You could still actually give a probability that you’ll get revived. Yes, I agree that knowing what the outcome of AGI is is extremely important, but you should still just have a probability for that.
Well, that gets tricky, because I have weak subjective evidence that I can’t share with anyone else, and really odd ideas about it, that makes me think that an FAI is the likely outcome. (Basically, I suspect something sorta kinda a little along the lines of me living in a fun theory universe. Or more precisely, I am a sub-computation of a longer computation that is optimized for fun, so that even though my life is sub-optimal at the moment I expect it to get a lot better in the future, and that the average of the whole computation’s fun will turned out to be argmaxed. Any my life right now rocks pretty hard anyway. I suspect other people have weaker versions of this [with different evidence from mine] with correspondingly weaker probability estimates for this kind of thing happening.) So if we assume with p=1 that a positive singularity will occur for sake of ease, that leaves about 2% that cryonics will work (5% that an FAI raises the cryonic dead minus 3% that an FAI raises all the dead) if you die times the probability that you die before the singularity (about 15% for most people [but about 2% for me]) which leads to 0.3% as my figure for someone with a sense of identity far stronger than me, Kaj, and many others, who would adjust downward from there (an FAI can be expected to extrapolate our minds and discover it should use the resources on making 10 people with values similar to ourself instead, or something). If you say something like 5% positive singularity instead, then it comes out to 0.015%, or very roughly 1 in 7000 (although of course your decision theory should discount worlds in which you die no matter what anyway, so that the probability of actually living past the singularity shouldn’t change your decision to sign up all that much). I suspect someone with different intuitions would give a very different answer, but it’ll be hard to make headway in debate because it really is so non-technical. The reason I give extremely low probabilities for myself is due to considerations that apply to me only and that I’d rather not go into.
Hmm… Seems like crazy talk to me. It’s your mind, tread softly.
The ideas about fun theory are crazy talk indeed, but they’re sort of tangential to my main points. I have much crazier ideas peppered throughout the comments of this post (very silly implications of decision theory in a level 4 multiverse that are almost assuredly wrong but interesting intuition pumps) and even crazier ideas in the notes I write to myself. Are you worried that this will lead to some sort of mental health danger, or what? I don’t know how often high shock levels damage one’s sanity to an appreciable degree.
It’s not “shock levels” which are a problem, it’s working in the “almost assuredly wrong” mode. If you yourself believe ideas you develop to be wrong, are they knowledge, are they progress? Do crackpots have “damaged sanity”?
It’s usually better to develop ideas on as firm ground as possible, working towards the unknown from statements you can rely on. Even in this mode you will often fail, but you’d be able to make gradual progress that won’t be illusory. Not all questions are ready to be answered (or even asked).
98% certain that the singularity will happen before you die (which could easily be 2070)? This seems like an unjustifiably high level of confidence.
For what it’s worth, the Uncertain Future application gives me a 99% chance of a singularity before 2070, if I recall correctly. The mean of my distribution is 2028.
I really wish more SIAI members talked to each other about this! Estimates vary wildly, and I’m never sure if people are giving estimates taking into account their decision theory or not (that is, thinking ‘We couldn’t prevent a negative singularity if it were to occur in the next 10 years, so let’s discount those worlds and exclude them from our probability estimates’). I’m also not sure if people are giving far-off estimates because they don’t want to think about the implications otherwise, or because they tried to build an FAI and it didn’t work, or because they want to signal sophistication and sophisticated people don’t predict crazy things happening very soon, or because they are taking an outside view of the problem, or because they’ve read the recent publications at the AGI conferences and various journals, thought about advances that need to be made, estimated the rate of progress, and determined a date using the inside view (like Steve Rayhawk, who gives a shorter time estimate than anyone else; or Shane Legg, who I’ve heard also gives a short estimate, though I am not sure about that; or Ben Goertzel, who I am again not entirely sure about; or Juergen Schmidhuber, who seems to be predicting it soonish; or Eliezer, who used to have a soonish estimate with very wide tails, though I have no idea what his thoughts are now). I’ve heard the guys at FHI also have distant estimates, and a lot of narrow AI people predict far-off AGI as well. Where are the ‘singularity is far’ people getting their predictions?
UF is not accurate!
True. But the mean of my distribution is still 2028 regardless of the inaccuracy of UF.
The problem with the Uncertain Future is that it is a model of reality which lets you play with the parameters of the model, but not the structure. For example, it has no option for “model uncertainty”, e.g. the possibility that the assumptions it makes about the forms of probability distributions are incorrect. And a lot of these assumptions were made for the sake of tractability rather than realism. I think the best way to use it is as an intuition pump for your own model, which you could build in Excel or in your head.
Giving probabilities of 99% is a classic symptom of not having any model uncertainty.
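As a toy illustration of that point, mixing even a modest amount of structural doubt into a model’s output pulls a 99% figure back toward whatever your fallback estimate is; every number below is made up purely for the example:

```python
# Sketch: folding model uncertainty into a point estimate. Numbers are illustrative.
p_model       = 0.99   # what the model says (e.g. singularity before 2070)
p_model_wrong = 0.20   # chance the model's structural assumptions are badly off
p_fallback    = 0.50   # estimate to fall back on if the model is wrong

p_adjusted = p_model * (1 - p_model_wrong) + p_fallback * p_model_wrong
print(round(p_adjusted, 3))   # 0.892 -- the 99% does not survive much structural doubt
```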
If Nick and I write some more posts I think this would be the theme. Structural uncertainty is hard to think around.
Anyway, I got my singularity estimates by listening to lots of people working at SIAI and seeing whose points I found compelling. When I arrived at Benton I was thinking something like 2055. It’s a little unsettling that the more arguments I hear from both sides, the nearer in the future my predictions land. I think my estimates are probably too biased towards Steve Rayhawk’s, but this is because everyone else’s estimates seem to take the form of outside view considerations that I find weak.
This seems to rely on your idea that, on reflection, humans probably don’t care about themselves, i.e. if I reflected sufficiently hard, I would place zero terminal value on my own life.
I wonder how you’re so confident about this? Like, 95% confident that all humans would place zero terminal value on their own lives?
Note also that it is possible that some but not all people would, on reflection, place zero value on their own lives.
Not even close to zero, but less terminal value than you would assign to other things that an FAI could optimize for. I’m not sure how much extrapolated unity of mankind there would be in this regard. I suspect Eliezer or Anna would counter my 5% with a 95%, and I would Aumann to some extent, but I was giving my impression and not my belief. (I think that this is better practice at the start of a ‘debate’: otherwise you might update on the wrong expected evidence. EDIT: To be more clear, I wouldn’t want to update on Eliezer’s evidence if it were some sort of generalization from fictional evidence from Brennan’s world or something, but I would want to update if he had a strong argument that identity has proven to be extremely important to all of human affairs since the dawn of civilization, which is entirely plausible.)
It seems odd to me that out of the roughly 10^57 atoms in the solar system, there would not be any left to revive cryo patients. My impression is that an FAI would revive cryo patients, with probability 80%, the remaining 20% being for very odd scenarios that I just can’t think of.
I guess I’m saying the amount of atoms it takes to revive a cryo patient is vastly more wasteful than its weight in computronium. You’re trading off one life for a huge amount of potential lives. A few people, like Alicorn if I understand her correctly, think that people who are already alive are worth a huge number of potential lives, but I don’t quite understand that intuition. Is this a point of disagreement for us?
Yeah, but the cryo patient could be run in software rather than in hardware, which would mean that it would be a rather insignificant amount of extra effort.
Gah, sorry, I keep leaving things out. I’m thinking about the actual physical work of locating cryo patients, scanning their brains, repairing the damage, and then running them. Mike Blume had a good argument against this point: proportionally, the startup cost of scanning a brain is not much at all compared to the infinity of years of actually running the computation. This is where I should be doing the math… so I’m going to think about it more and try to figure things out. Another point is that an AGI could gain access to infinite computing power in finite time, during which it could do everything, but I think I’m just confused about the nature of computations in a Tegmark multiverse here.
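For what it’s worth, here is a toy version of Mike Blume’s amortization point; the costs are arbitrary placeholder units, chosen only to show how quickly the one-time scan is dwarfed by the ongoing computation:

```python
# Toy amortization sketch: one-time scan/repair cost versus ongoing running cost.
# Units and magnitudes are arbitrary placeholders, not real estimates.
startup_cost  = 1_000.0   # locating, scanning, and repairing one cryo patient
cost_per_year = 1.0       # running the resulting upload for one year

for years in (10, 1_000, 1_000_000):
    total = startup_cost + cost_per_year * years
    print(years, round(startup_cost / total, 4))   # startup share shrinks toward zero
```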
I hadn’t thought of that; certainly if the AI’s mission was to run as many experience-moments as possible in the amount of space-time-energy it had, then it wouldn’t revive cryo patients.
Note that the same argument says that it would kill all existing persons rather than upload them, and re-use their mass and energy to run ems of generic happy people (maximizing experience moments without regard to any deontological constraints has some weird implications...)
Yes, but this makes people flustered so I prefer not to bring it up as a possibility. I’m not sure if it was Bostrom or just generic SIAI thinking where I heard that an FAI might deconstruct us in order to go out into the universe, solve the problem of astronomical waste, and then run computations of us (or in this case generic transhumans) far in the future.
Of course at this point, the terminology “Friendly” becomes misleading, and we should talk about a Goal-X-controlled-AGI, where Goal X is a variable for the goal that that AGI would optimize for.
There is no unique value for X. Some have suggested the output of CEV as the goal system, but if you look at CEV in detail, you see that it is jam-packed with parameters, all of which make a difference to the actual output.
I would personally lobby against the idea of an AGI that did crazy shit like killing existing people to save a few nanoseconds.
Hm, I’ve noticed before that the term ‘Friendly’ is sort of vague. What would I call an AI that optimizes strictly for my goals (and if I care about others’ goals, so be it)? A Will-AI? I’ve said a few times ‘your Friendly is not my Friendly’ but I think I was just redefining Friendliness in an incorrect way that Eliezer wouldn’t endorse.
One could say “Friendly towards Will.”
But the problem of nailing down your goals seems to me much harder than the problem of negotiating goals between different people. Thus I don’t see a problem of being vague about the target of Friendliness.
Agreed. And asking the question of what is preference of a specific person, represented in some formal language, seems to be a natural simplification of the problem statement, something that needs to be understood before the problem of preference aggregation can be approached.
Beware of the urge to censor thoughts that disagree with authority. I personally agree that there is a serious issue here—the issue of moral antirealism, which implies that there is no “canonical human notion of goodness”, so the terminology “Friendly AI” is actually somewhat misleading, and it might be better to say “average human extrapolated morality AGI” when that’s what we want to talk about, e.g.
Then it sounds less onerous to say that you disagree with what an average human extrapolated morality AGI would do than that you disagree with what a “Friendly AI” would do, because most people on this forum disagree with averaged-out human morality (for example, the average human is a theist). Contrast:
“Friendly AI” is about as specific/ambiguous as “morality”—something humans mostly have in common, allowing for normal variation, not referring to details about specific people. As with preference (morality) of specific people, we can speak of FAI optimizing the world to preference of specific people. Naturally, for each given person it’s preferable to launch a personal-FAI to a consensus-FAI.
I am reasonably confident that no such process can produce an entity that I would identify as myself. Being reconstructed from other peoples’ memories means losing the memories of all inner thoughts, all times spent alone, and all times spent with people who have died or forgotten the occasion. That’s too much lost for any sort of continuity of consciousness.
Hm, well we can debate the magic powers a superintelligence possesses (whether or not it can raise the dead), but I think this would make Eliezer sad. I for one am not reasonably confident either way. I am not willing to put bounds on an entity that I am not sure won’t get access to an infinite amount of computation in finite time. At any rate, it seems we have different boundaries around identity. I’m having trouble removing the confusion about identity from my calculations.
You suspect that most people, upon reflection, won’t care whether they live or die? I’m intrigued: what makes you think this?
Nope, “definition of identity” doesn’t influence what actually happens as a result of your decision, and thus doesn’t influence how good what happens will be.
You are not really trying to figure out “How likely is it to survive as a result of signing up?”, that’s just an instrumental question that is supposed to be helpful, you are trying to figure out which decision you should make.
Simply wrong. I can assign positive utility to whatever interpretation of an event I please. If the map changes, the utility changes, even if the territory stays the same. Preferences are not in the territory. Did I misunderstand you?
EDIT: Ah, I think I know what happened: Roko and I were talking about the probability of me being ‘saved’ by cryonics in the thread he linked to, but perhaps you missed that. Let me copy/paste something I said from this thread: “I tried to make it clear in my post and when talking to both you and Vladimir Nesov that I prefer talking about ‘probability that I will get enough utility to justify cryonics upon reflection’ instead of ‘probability that cryonics will result in revival, independent of whether or not that will be considered a good thing upon reflection’. That’s why I put in the abnormally important footnote.” I don’t think I emphasized this enough. My apologies. (I feel silly, because without this distinction you’ve probably been thinking I’ve been committing the mind projection fallacy this whole time, and I didn’t notice.)
Not sure I’m parsing this right. Yes, I am determining what decision I should make. The instrumental question is a part of that, but it is not the only consideration.
You haven’t misunderstood me, but you need to pay attention to this question, because it’s more or less a consensus on Less Wrong that your position expressed in the above quote is wrong. You should maybe ask around for clarification of this point, if you don’t get a change of mind from discussion with me.
You may try the metaethics sequence, and also/in particular these posts:
http://lesswrong.com/lw/s6/probability_is_subjectively_objective/
http://lesswrong.com/lw/si/math_is_subjunctively_objective/
http://lesswrong.com/lw/sj/does_your_morality_care_what_you_think/
http://lesswrong.com/lw/sw/morality_as_fixed_computation/
http://lesswrong.com/lw/t0/abstracted_idealized_dynamics/
That preference is computed in the mind doesn’t make it any less a part of the territory than anything else. This is just a piece of the territory that happens to be currently located in human minds. (Well, not quite, but to a first approximation.)
Your map may easily change even if the territory stays the same. This changes your belief, but this change doesn’t influence what’s true about the territory. Likewise, your estimate of how good situation X is may change, once you process new arguments or change your understanding of the situation, for example by observing new data, but that change of your belief doesn’t influence how good X actually is. Morality is not a matter of interpretation.
Before I spend a lot of effort trying to figure out where I went wrong (which I’m completely willing to do, because I read all of those posts and the metaethics sequence and figured I understood them), can you confirm that you read my EDIT above, and that the misunderstanding addressed there does not encompass the problem?
Now I have read the edit, but it doesn’t seem to address the problem. Also, I don’t see what use you can make of the concepts you bring up, like “probability that I will get enough utility to justify cryonics upon reflection”. If you expect to believe something, you should just believe it right away. See Conservation of expected evidence. But then, “probability this decision is right” is not something you can use for making the decision, not directly.
This might not be the most useful concept, true, but the issue at hand is the meta-level one of people’s possible overconfidence about it.
“Probability of signing up being good”, especially obfuscated with “justified upon infinite reflection”, being subtly similar to “probability of the decision to sign up being correct”, is too much of a ruse to use without very careful elaboration. A decision can be absolutely, 99.999999% correct, while the probability of it being good remains at 1%, both known to the decider.
So you read footnote 2 of the post and do not think it is a relevant and necessary distinction? And you read Steven’s comment in the other thread where it seems he dissolved our disagreement and determined we were talking about different things?
I know about the conservation of expected evidence. I understand and have demonstrated understanding of the content in the various links you’ve given me. I really doubt I’ve been making the obvious errors you accuse me of for the many months I’ve been conversing with people at SIAI (and at Less Wrong meetups and at the decision theory workshop) without anyone noticing.
Here’s a basic summary of what you seem to think I’m confused about: There is a broad concept of identity in my head. Given this concept of identity I do not want to sign up for cryonics. If this concept of identity changed such that the set of computations I identified with became smaller, then cryonics would become more appealing. I am talking about the probability of expected utility, not the probability of an event. The first is in the map (even if the map is in the territory, which I realize, of course), the second is in the territory.
EDIT: I am treating considerations about identity as a preference: whether or not I should identify with any set of computations is my choice, but subject to change. I think that might be where we disagree: you think everybody will eventually agree what identity is, and that it will be considered a fact about which we can assign different probabilities, but not something subjectively determined.
That preference is yours and yours alone, without any community to share it, doesn’t make its content any less of a fact than if you’d had a whole humanity of identical people to back it up. (This identity/probability discussion is tangential to a more focused question of correctness of choice.)
The easiest step is for you to look over the last two paragraphs of this comment and see if you agree with that. (Agree/disagree in what sense, if you suspect essential interpretational ambiguity.)
I don’t know why you brought up the concept of identity (or indeed cryonics) in the above, it wasn’t part of this particular discussion.
At first glance and 15 seconds of thinking, I agree, but: “but that change of your belief doesn’t influence how good X actually is” is to me more like “but that change of your belief doesn’t influence how good X will be considered upon an infinite amount of infinitely good reflection”.
Now try to figure out what the question “What color is the sky, actually?” means, when compared with “How good is X, actually?” and your interpretation “How good will X seem after an infinite amount of infinitely good reflection?”. The “infinitely good reflection” thing is a surrogate for the fact itself, no less in the first case, and no more in the second.
If you essentially agree that there is fact of the matter about whether a given decision is the right one, what did you mean by the following?
You can’t “assign utility as you please”, this is not a matter of choice. The decision is either correct or it isn’t, and you can’t make it correct or incorrect by willing so. You may only work on figuring out which way it is, like with any other fact.
Edit: adding a sentence in bold that is really important but that I failed to notice the first time. (Nick Tarleton alerted me to an error in this comment that I needed to fix.)
Any intelligent agent will discover that the sky is blue. Not every intelligent agent will think that the blue sky is equally beautiful. Me, I like grey skies and rainy days. If I discover that I actually like blue skies at a later point, then that changes the perceived utility of seeing a grey sky relative to a blue one. The simple change in preference also changes my expected utility. Yes, maybe the new utility was the ‘correct’ utility all along, but how is that an argument against anything I’ve said in my posts or comments? I get the impression you consistently take the territory view where I take the map view, and I further think that the map view is way more useful for agents like me that aren’t infinitely intelligent nor infinitely reflective. (Nick Tarleton disagrees about taking the map view and I am now reconsidering. He raises the important point that taking the territory view doesn’t mean throwing out the map, and gives the map something to be about. I think he’s probably right.)
And the way one does this is by becoming good at luminosity and discovering what one’s terminal values are. Yeah, maybe it turns out sufficiently intelligent agents all end up valuing the exact same thing, and FAI turns out to be really easy, but I do not buy it as an assertion.
This reads to me like
See the error? That there are moral facts doesn’t imply that everyone’s preference is identical, that “all intelligent agents” will value the same thing. Every sane agent should agree on what is moral, but not every sane agent is moved by what is moral, some may be moved by what is prime or something, while agreeing with you that what is prime is often not moral. (See also this comment.)
I’m a little confused about your “weight of a person” example because ‘a’ is ambiguous in English. Did you mean one specific person, or the weighing of different people?
What if CEV doesn’t exist, and there really are different groups of humans with different values? Is one set of values “moral” and the other “that other human thing that’s analogous to morality but isn’t morality”? Primeness is so different from morality that it’s clear we’re talking about two different things. But say we take what you’re calling morality and modify it very slightly, only to the point where many humans still hold to the modified view. It’s not clear to me that the agents will say “I’m moved by this modified view, not morality”. Why wouldn’t they say “No, this modification is the correct morality, and I am moved by morality!”
I have read the metaethics sequence but don’t claim to fully understand it, so feel free to point me to a particular part of it.
Of course different people have different values. These values might be similar, but they won’t be identical.
Yes, but what is “prime number”? Is it 5, or is it 7? 5 is clearly different from 7, although it’s very similar to it in that it’s also prime. Use the analogy of prime=moral and 5=Blueberry’s values, 7=Will’s values.
Because that would be pointless disputing of definitions—clearly, different things are meant by the word “morality” in your example.
I see your point, but there is an obvious problem with this analogy: prime and nonprime are two discrete categories. But we can consider a continuum of values, ranging from something almost everyone agrees is moral, through values that are unusual or uncommon but still recognized as human values, all the way to completely alien values like paperclipping.
My concern is that it’s not clear where in the continuum the values stop being “moral” values, unlike with prime numbers.
It might be unclear where the line lies, but it shouldn’t make the concept itself “fuzzy”, merely not understood. What we talk about when we refer to a certain idea is always something specific, but it’s not always clear what is implied by what we talk about. That different people can interpret the same words as referring to different ideas doesn’t make any of these different ideas undefined. The failure to interpret the words in the same way is a failure of communication, not a characterization of the idea that failed to be communicated.
I of course agree that “morality” admits a lot of similar interpretations, but I’d venture to say that “Blueberry’s preference” does as well. It’s an unsolved problem—a core question of Friendly AI—to formally define any of the concepts interpreting these words in a satisfactory way. The fuzziness in communication and elusiveness in formal understanding are relevant equally for the aggregate morality and personal preference, and so the individual/aggregate divide is not the point that particularly opposes the analogy.
I’m still very confused.
Do you think there is a clear line between what humans in general value (morality) and what other entities might value, and we just don’t know where it is? Let’s call the other side of the line ‘schmorality’. So a paperclipper’s values are schmoral.
Is it possible that a human could have values on the other side of the line (schmoral values)?
Suppose another entity, who is on the other side of the line, has a conversation with a human about a moral issue. Both entities engage in the same kind of reasoning, use the same kind of arguments and examples, so why is one reasoning called “moral reasoning” and the other just about values (schmoral reasoning)?
Suppose I am right on the edge of the line. So my values are moral values, but a slight change makes these values schmoral values. From my point of view, these two sets of values are very close. Why do you give them completely different categories? And suppose my values change slightly over time, so I cross the line and back within a day. Do I suddenly stop caring about morality, then start again? This discontinuity seems very strange to me.
I don’t say that any given concept is reasonable for all purposes, just that any concept has a very specific intended meaning, at the moment it’s considered. The concept of morality can be characterized as, roughly, referring to human-like preference, or aggregate preference of humanity-like collections of individual preferences—this is a characterization resilient to some measure of ambiguity in interpretation. The concepts themselves can’t be negotiated, they are set in stone by their intended meaning, though a different concept may be better for a given purpose.
Thanks! That actually helped a lot.
In this exchange
Will, by “definition of identity”, meant a part of preference, making the point that people might have varying preferences (this being the sense in which preference is “subjective”) that make cryonics a good idea for some but not others. He read your response as a statement of something like moral realism/externalism; he intended his response to address this, though it was phrased confusingly.
That would be a potentially defensible view (What are the causes of variation? How do we know it’s there?), but I’m not sure it’s Will’s (and using the word “definition” in this sense goes very much against the definition of “definition”).
Similar to what I think JoshuaZ was getting at, signing up for cryonics is a decently cheap signal of your rationality and willingness to take weird ideas seriously, and it’s especially cheap for young people like me who might never take advantage of the ‘real’ use of cryonics.
Really? Even if you buy into Will’s estimate, there are at least three arguments that are not weak:
1) The expected utility argument (I presented arguments above for why this fails, but it isn’t completely clear that those rebuttals are valid)
2) One might think that buying into cryonics helps force people (including oneself) to think about the future in a way that produces positive utility.
3) One gets a positive utility from the hope that one might survive using cryonics.
Note that all three of these are fairly standard pro-cryonics arguments that remain valid even with the low probability estimate made by Will.
none of those hold for p = 1 in a million.
Expected utility doesn’t hold because you can use the money to give yourself more than a 1-in-a-million boost to your chance of surviving to the singularity, for example by buying 9000 lottery tickets and funding SIAI if you win.
1 in a million is really small.
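To put a rough number on “really small”: for the naive expected-utility argument to go through at p = 10^-6, the value placed on revival has to be enormous. The dollar figure below is a hypothetical placeholder, not actual cryonics pricing:

```python
# Back-of-the-envelope break-even at p = 1e-6.
# The lifetime cost is a hypothetical placeholder, not a quote from any provider.
p_works       = 1e-6
lifetime_cost = 50_000.0   # assumed total spent on dues plus insurance

breakeven_value = lifetime_cost / p_works
print(f"{breakeven_value:.0e}")   # 5e+10 -- revival must be worth ~$50 billion to break even
```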
That really depends a lot on the expected utility. Moreover, argument 2 above (getting people to think about long-term prospects) has little connection to the value of p.
The point about thinking more about the future with cryo is that you expect to be there.
p=1 in 1 million means you don’t expect to be there.
Even a small chance that you will be there helps put people in the mind-set to think long-term.
Re: “whether it is plausible to rationally reject it”
Of course people can plausibly rationally reject cryonics!
Surely nobody has been silly enough to argue that cryonics makes good financial sense—irrespective of your goals and circumstances.
If your goals don’t include self-preservation, then it is not for you.
In biology, individual self-preservation is an emergent subsidiary goal—what is really important is genetic self-preservation.
Organisms face a constant trade-off—whether to use resources now to reproduce, or whether to invest them in self-perpetuation—in the hope of finding a better chance to reproduce in the future.
Calorie restriction and cryonics are examples of this second option—sacrificing current potential for the sake of possible future gains.
Evolution faces this trade-off. Individual organisms are just stuck with trade-offs already made, and (if they happen to be endowed with explicit motivations) may be motivated by something quite other than “a better chance to reproduce in the future”.
Organisms choose—e.g. they choose whether to do calorie restriction—which diverts resources from reproductive programs to maintenance ones. They choose whether to divert resources in the direction of cryonics companies as well.
I’m not disputing that organisms choose. I’m disputing that organisms necessarily have reproductive programs. (You can only face a trade-off between two goals if you value both goals to start with.) Some organisms may value self-preservation, and value reproduction not at all (or only insofar as they view it as a form of self-preservation).
Not all organisms choose—for example, some have strategies hard-wired into them—and others are broken.