We were talking about the probability of getting ‘saved’, and ‘saved’ to me requires that the future turns out such that, upon reflection, I will be thankful that I was revived rather than having those resources used for something else I would have liked to happen. In the vast majority of post-singularity worlds I do not think this will be the case. In fact, in the vast majority of post-singularity worlds, I think cryonics becomes plain irrelevant. Hence my sorta-extreme views on the subject.
I tried to make it clear in my post and when talking to both you and Vladimir Nesov that I prefer talking about ‘probability that I will get enough utility to justify cryonics upon reflection’ instead of ‘probability that cryonics will result in revival, independent of whether or not that will be considered a good thing upon reflection’. That’s why I put in the abnormally important footnote.
Oh, I see, my bad, apologies for the misunderstanding.
In which case, I ask: what is your probability that if you sign up for cryo now, you will be cryopreserved and revived (i.e. that your brain-state will be faithfully restored)? (This being something that you and I ought to agree on, and which ought to come out roughly the same if you replace “Will” with “Roko”.)
Cool, I’m glad to be talking about the same thing now! (I guess any sort of misunderstanding/argument causes me a decent amount of cognitive burden that I don’t realize is there until after it is removed. Maybe a fear of missing an important point that I will be embarrassed about having ignored upon reflection. I wonder if Steve Rayhawk experiences similar feelings on a regular basis?)
Well here’s a really simple, mostly qualitative analysis, with the hope that “Will” and “Roko” should be totally interchangeable.
Option 1: Will signs up for cryonics.
uFAI is developed before Will is cryopreserved. Signing up for cryonics doesn’t work, but this possibility has no significance in our decision theory anyway.
uFAI is developed after Will is cryopreserved. Signing up for cryonics doesn’t work, but this possibility has no significance in our decision theory anyway.
FAI is developed before Will is cryopreserved. Signing up for cryonics never gets a chance to work for Will specifically.
FAI is developed after Will is cryopreserved. Cryonics might work, depending on the implementation and results of things like CEV. This is a huge question mark for me. Something close to 50% is probably appropriate, but at times I have been known to say something closer to 5%, based on considerations like ‘An FAI is not going to waste resources reviving you: rather, it will spend resources on fulfilling what it expects your preferences probably were. If your preferences mandate you being alive, then it will do so, but I suspect that most humans, upon much reflection and moral evolution, won’t care as much about their specific existence.’ Anna Salamon and, I think, Eliezer suspect that personal identity is more central to human-ness than e.g. Steve Rayhawk and I do, for what it’s worth.
An existential risk occurs before Will is cryopreserved. Signing up for cryonics doesn’t work, but this possibility has no significance in our decision theory anyway.
An existential risk occurs after Will is cryopreserved. Signing up for cryonics doesn’t work, but this possibility has no significance in our decision theory anyway.
Option 2: Will does not sign up for cryonics.
uFAI is developed before Will dies. This situation is irrelevant to our decision theory.
uFAI is developed after Will dies. This situation is irrelevant to our decision theory.
FAI is developed before Will dies. This situation is irrelevant to our decision theory.
FAI is developed after Will dies. Because Will was not cryopreserved, the FAI does not revive him in the typical sense. However, perhaps it can faithfully restore Will’s brain-state from recordings of Will in the minds of humanity anyway, if that’s what humanity would want. Alternatively, Will is revived in ancestor simulations done by the FAI, or by any other FAI that is curious about humanity’s history around the time right before its singularity. Measure is really important here, so I’m confused. I suspect the probability is less, but not orders of magnitude less, than the 50% figure above? This is an important point.
An existential risk occurs (before or after Will dies). This possibility has no significance in our decision theory anyway.
Basically, the point is that the most important factor by far is what an FAI does after going FOOM, and we don’t really know what’s going to happen there. So cryonics becomes a matter of preference more than a matter of probability. But if you’re including worlds that our decision theory discounts, e.g. where uFAI or rogue MNT is developed, then the probability of being revived drops a lot.
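To make that bookkeeping concrete, here is a minimal sketch in Python. All the numbers are placeholders picked for illustration (they are not Will’s or Roko’s actual estimates); the only point is how the headline figure changes depending on whether the discounted worlds are counted.

```python
# Placeholder probabilities, chosen only for illustration.
p_branches = {
    "uFAI or existential risk": 0.5,     # discounted: Will dies whether or not he signed up
    "FAI before cryopreservation": 0.2,  # cryonics never gets a chance to matter
    "FAI after cryopreservation": 0.3,   # the branch the decision actually hinges on
}
p_revive_given_fai_after = 0.5           # the "huge question mark" above

# Counting every world, the unconditional probability of revival is small:
p_unconditional = p_branches["FAI after cryopreservation"] * p_revive_given_fai_after
print(p_unconditional)              # 0.15 -- "the probability of being revived drops a lot"

# Dropping the worlds the decision theory discounts (Will dies in them regardless of
# his choice), the number relevant to signing up is just the FAI-branch figure itself:
print(p_revive_given_fai_after)     # 0.5
```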
You could still actually give a probability that you’ll get revived. Yes, I agree that knowing what the outcome of AGI will be is extremely important, but you should still just have a probability for that.
Well, that gets tricky, because I have weak subjective evidence that I can’t share with anyone else, and really odd ideas about it, that make me think that an FAI is the likely outcome. (Basically, I suspect something sorta kinda a little along the lines of me living in a fun theory universe. Or more precisely, I am a sub-computation of a longer computation that is optimized for fun, so that even though my life is sub-optimal at the moment I expect it to get a lot better in the future, and that the whole computation’s average fun will turn out to be maximized. And my life right now rocks pretty hard anyway. I suspect other people have weaker versions of this [with different evidence from mine] with correspondingly weaker probability estimates for this kind of thing happening.)

So if we assume with p=1 that a positive singularity will occur, for sake of ease, that leaves about 2% that cryonics will work (5% that an FAI raises the cryonic dead minus 3% that an FAI raises all the dead) if you die, times the probability that you die before the singularity (about 15% for most people [but about 2% for me]), which leads to 0.3% as my figure for someone with a sense of identity far stronger than me, Kaj, and many others, who would adjust downward from there (an FAI can be expected to extrapolate our minds and discover it should use the resources on making 10 people with values similar to ourselves instead, or something). If you say something like 5% positive singularity instead, then it comes out to 0.015%, or very roughly 1 in 7000 (although of course your decision theory should discount worlds in which you die no matter what anyway, so the probability of actually living past the singularity shouldn’t change your decision to sign up all that much).

I suspect someone with different intuitions would give a very different answer, but it’ll be hard to make headway in debate because it really is so non-technical. The reason I give extremely low probabilities for myself is due to considerations that apply to me only and that I’d rather not go into.
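For what it’s worth, here is the arithmetic from the comment above written out explicitly, a minimal sketch in Python. It uses only the figures already quoted (5%, 3%, 15%, and the alternative 5% for a positive singularity); nothing else is assumed.

```python
# The back-of-the-envelope estimate above, spelled out. All figures come
# straight from the comment; none of them are new inputs.
p_fai_raises_cryo_dead = 0.05    # FAI revives cryopreserved people
p_fai_raises_all_dead = 0.03     # FAI reconstructs everyone anyway, cryo or not
p_die_before_singularity = 0.15  # "about 15% for most people"
p_positive_singularity = 0.05    # the alternative assumption at the end

# Marginal benefit of cryonics: cases where preservation makes the difference.
p_cryo_matters = p_fai_raises_cryo_dead - p_fai_raises_all_dead        # 0.02

# Conditional on a positive singularity (p = 1 "for sake of ease"):
p_given_singularity = p_cryo_matters * p_die_before_singularity        # 0.003 -> "0.3%"

# With only a 5% chance of a positive singularity:
p_overall = p_given_singularity * p_positive_singularity               # 0.00015 -> "0.015%"
print(p_given_singularity, p_overall, round(1 / p_overall))            # ~6667, "roughly 1 in 7000"
```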
Hmm… Seems like crazy talk to me. It’s your mind, tread softly.
The ideas about fun theory are crazy talk indeed, but they’re sort of tangential to my main points. I have much crazier ideas peppered throughout the comments of this post (very silly implications of decision theory in a level 4 multiverse that are almost assuredly wrong but interesting intuition pumps) and even crazier ideas in the notes I write to myself. Are you worried that this will lead to some sort of mental health danger, or what? I don’t know how often high shock levels damage one’s sanity to an appreciable degree.
It’s not “shock levels” which are a problem, it’s working in the “almost assuredly wrong” mode. If you yourself believe ideas you develop to be wrong, are they knowledge, are they progress? Do crackpots have “damaged sanity”?
It’s usually better to develop ideas on as firm ground as possible, working towards the unknown from statements you can rely on. Even in this mode you will often fail, but you’d be able to make gradual progress that won’t be illusory. Not all questions are ready to be answered (or even asked).
98% certain that the singularity will happen before you die (which could easily be 2070)? This seems like an unjustifiably high level of confidence.
For what it’s worth, The Uncertain Future application gives me a 99% chance of a singularity before 2070, if I recall correctly. The mean of my distribution is 2028.
I really wish more SIAI members talked to each other about this! Estimates vary wildly, and I’m never sure whether people are giving estimates that take their decision theory into account (that is, thinking ‘We couldn’t prevent a negative singularity if it were to occur in the next 10 years, so let’s discount those worlds and exclude them from our probability estimates’). I’m also not sure whether people give far-off estimates because they don’t want to think about the implications otherwise, or because they tried to build an FAI and it didn’t work, or because they want to signal sophistication and sophisticated people don’t predict crazy things happening very soon, or because they are taking an outside view of the problem, or because they’ve read the recent publications at the AGI conferences and in various journals, thought about the advances that need to be made, estimated the rate of progress, and determined a date using the inside view. (In that last camp: Steve Rayhawk, who gives a shorter time estimate than anyone else; Shane Legg, who I’ve heard also gives a short estimate, though I’m not sure about that; Ben Goertzel, about whom I’m again not entirely sure; Juergen Schmidhuber, who seems to be predicting it soonish; and Eliezer, who used to have a soonish estimate with very wide tails, though I have no idea what his thoughts are now.) I’ve heard the guys at FHI also have distant estimates, and a lot of narrow AI people predict far-off AGI as well. Where are the ‘singularity is far’ people getting their predictions?
UF is not accurate!
True. But the mean of my distribution is still 2028 regardless of the inaccuracy of UF.
The problem with The Uncertain Future is that it is a model of reality which allows you to play with the parameters of the model, but not the structure. For example, it has no option for “model uncertainty”, e.g. the possibility that the assumptions it makes about the forms of probability distributions are incorrect. And a lot of these assumptions were made for the sake of tractability rather than realism. I think the best way to use it is as an intuition pump for your own model, which you could make in Excel or in your head.
Giving probabilities of 99% is a classic symptom of not having any model uncertainty.
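A toy illustration of the point, with made-up numbers: the 80% credence in the model’s structure and the 50% ignorance prior below are hypothetical, chosen only to show how quickly a model’s 99% decays once you admit the model itself might be wrong.

```python
# Hypothetical numbers only; the shape of the calculation is the point.
p_model_right = 0.8   # assumed credence that the model's structural assumptions hold
p_if_right = 0.99     # the model's own output
p_if_wrong = 0.5      # an ignorance prior for the case where the model is broken

p_singularity_by_2070 = p_model_right * p_if_right + (1 - p_model_right) * p_if_wrong
print(p_singularity_by_2070)   # 0.892 -- noticeably less than 99%
```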
If Nick and I write some more posts I think this would be the theme. Structural uncertainty is hard to think around.
Anyway, I got my singularity estimates by listening to lots of people working at SIAI and seeing whose points I found compelling. When I arrived at Benton I was thinking something like 2055. It’s a little unsettling that the more arguments I hear from both sides, the nearer in the future my predictions move. I think my estimates are probably too biased towards Steve Rayhawk’s, but that’s because everyone else’s estimates seem to take the form of outside-view considerations that I find weak.
Your “5% that an FAI raises the cryonic dead minus 3% that an FAI raises all the dead” seems to rely on your idea that, on reflection, humans probably don’t care about themselves, i.e. that if I reflected sufficiently hard, I would place zero terminal value on my own life.
I wonder how you’re so confident about this? Like, 95% confident that all humans would place zero terminal value on their own lives?
Note also that it is possible that some but not all people would, on reflection, place zero value on their own lives.
Not even close to zero, but less terminal value than you would assign to other things that an FAI could optimize for. I’m not sure how much extrapolated unity of mankind there would be in this regard. I suspect Eliezer or Anna would counter my 5% with a 95%, and I would Aumann to some extent, but I was giving my impression, not my belief. (I think that this is better practice at the start of a ‘debate’: otherwise you might update on the wrong expected evidence. EDIT: To be clearer, I wouldn’t want to update on Eliezer’s evidence if it were some sort of generalization from fictional evidence from Brennan’s world or something, but I would want to update if he had a strong argument that identity has proven to be extremely important to all of human affairs since the dawn of civilization, which is entirely plausible.)
It seems odd to me that out of the roughly 10^57 atoms in the solar system, there would not be any left to revive cryo patients. My impression is that an FAI would revive cryo patients, with probability 80%, the remaining 20% being for very odd scenarios that I just can’t think of.
I guess I’m saying that spending the atoms it takes to revive a cryo patient is vastly more wasteful than using their weight in computronium. You’re trading off one life for a huge number of potential lives. A few people, like Alicorn if I understand her correctly, think that people who are already alive are worth a huge number of potential lives, but I don’t quite understand that intuition. Is this a point of disagreement for us?
Yeah, but the cryo patient could be run in software rather than in hardware, which would mean that it would be a rather insignificant amount of extra effort.
Gah, sorry, I keep leaving things out. I’m thinking about the actual physical process of finding out where cryo patients are, scanning their brains, repairing the damage, and then running them. Mike Blume had a good argument against this point: proportionally, the startup cost of scanning a brain is not much at all compared to the infinity of years of actually running the computation. This is where I should be doing the math… so I’m going to think about it more and try to figure things out. Another point is that an AGI could gain access to infinite computing power in finite time, during which it could do everything, but I think I’m just confused about the nature of computations in a Tegmark multiverse here.
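Mike Blume’s point can be made with a toy calculation. Every number below is hypothetical (the units are arbitrary “compute”); only the orders of magnitude matter.

```python
# Hypothetical costs, arbitrary units; only the ratio matters.
scan_and_repair_cost = 1e6   # one-time cost: locate, scan, and repair one cryo patient
run_cost_per_year = 1e3      # ongoing cost of running the resulting upload
years_of_running = 1e9       # any long-lived future makes this term dominate

total = scan_and_repair_cost + run_cost_per_year * years_of_running
print(scan_and_repair_cost / total)   # ~1e-6: the startup cost is a vanishing fraction
```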
I hadn’t thought of that; certainly if the AI’s mission was to run as many experience-moments as possible in the amount of space-time-energy it had, then it wouldn’t revive cryo patients.
Note that the same argument says that it would kill all existing persons rather than upload them, and re-use their mass and energy to run ems of generic happy people (maximizing experience moments without regard to any deontological constraints has some weird implications...)
Yes, but this makes people flustered, so I prefer not to bring it up as a possibility. I’m not sure whether it was Bostrom or just generic SIAI thinking, but I’ve heard that an FAI might deconstruct us in order to go out into the universe, solve the problem of astronomical waste, and then run computations of us (or in this case generic transhumans) far in the future.
Of course at this point, the terminology “Friendly” becomes misleading, and we should talk about a Goal-X-controlled-AGI, where Goal X is a variable for the goal that that AGI would optimize for.
There is no unique value for X. Some have suggested the output of CEV as the goal system, but if you look at CEV in detail, you see that it is jam-packed with parameters, all of which make a difference to the actual output.
I would personally lobby against the idea of an AGI that did crazy shit like killing existing people to save a few nanoseconds.
Hm, I’ve noticed before that the term ‘Friendly’ is sort of vague. What would I call an AI that optimizes strictly for my goals (and if I care about others’ goals, so be it)? A Will-AI? I’ve said a few times ‘your Friendly is not my Friendly’ but I think I was just redefining Friendliness in an incorrect way that Eliezer wouldn’t endorse.
One could say “Friendly towards Will.”
But the problem of nailing down your goals seems to me much harder than the problem of negotiating goals between different people. Thus I don’t see a problem of being vague about the target of Friendliness.
Agreed. And asking what the preference of a specific person is, represented in some formal language, seems to be a natural simplification of the problem statement, something that needs to be understood before the problem of preference aggregation can be approached.
but I think I was just redefining Friendliness in an incorrect way that Eliezer wouldn’t endorse.
Beware of the urge to censor thoughts that disagree with authority. I personally agree that there is a serious issue here: the issue of moral antirealism, which implies that there is no “canonical human notion of goodness”, so the terminology “Friendly AI” is actually somewhat misleading, and it might be better to say “average human extrapolated morality AGI” when that’s what we want to talk about, e.g.
“an average human extrapolated morality AGI would oppose a paperclip maximizer”.
Then it sounds less onerous to say that you disagree with what an average human extrapolated morality AGI would do than that you disagree with what a “Friendly AI” would do, because most people on this forum disagree with averaged-out human morality (for example, the average human is a theist). Contrast:
“What, you disagree with the FAI? Are you a bad guy then?”
“Friendly AI” is about as specific/ambiguous as “morality”: something humans mostly have in common, allowing for normal variation, not referring to details about specific people. As with the preference (morality) of specific people, we can speak of an FAI optimizing the world to the preference of specific people. Naturally, for each given person it’s preferable to launch a personal-FAI rather than a consensus-FAI.
perhaps it can faithfully restore Will’s brain-state from recordings of Will in the minds of humanity
I am reasonably confident that no such process can produce an entity that I would identify as myself. Being reconstructed from other people’s memories means losing the memories of all inner thoughts, all times spent alone, and all times spent with people who have died or forgotten the occasion. That’s too much lost for any sort of continuity of consciousness.
Hm, well we can debate the magic powers a superintelligence possesses (whether or not it can raise the dead), but I think this would make Eliezer sad. I for one am not reasonably confident either way. I am not willing to put bounds on an entity that I am not sure won’t get access to an infinite amount of computation in finite time. At any rate, it seems we have different boundaries around identity. I’m having trouble removing the confusion about identity from my calculations.
You suspect that most people, upon reflection, won’t care whether they live or die? I’m intrigued: what makes you think this?