Many people have a strong intuition that we should be happy for our AI descendants, whatever they choose to do. They grant the possibility of pathological preferences like paperclip-maximization, and agree that turning over the universe to a paperclip-maximizer would be a problem, but don’t believe it’s realistic for an AI to have such uninteresting preferences.
Here I can relate to the first sentence, but not to the others, so you may be failing some ITT. It's not that paperclip maximizers are unrealistic. It's that they are not really that bad. Yes, I would prefer not to be converted into paperclips, but I can still be happy that human civilization, even if extinct, has left a permanent mark on the universe. This is not the worst way to go. And we are going away sooner or later anyway: unless we really work for it, our descendants 1 million years from now will not be called humans and will not share our values. I don't see much of a reason to believe that the values of my biological descendants will be less ridiculous to me than paperclip maximization.
Also, I'm seeing unjustified assumptions that human values, let alone alien values, are safe. The probability that humans, given enough power, would destroy ourselves is not zero, and is possibly quite substantial. In that case, building an AI that is dedicated to the preservation of the human species, but not well aligned with any other human values, could be a very reasonable idea.
The values you're expressing here are hard for me to comprehend. Paperclip maximization isn't that bad, because we leave a permanent mark on the universe? The deaths of you, everyone you love, and everyone in the universe aren't that bad (99% of the way from extinction that doesn't leave a permanent mark to flourishing?) because we'll have altered the shape of the cosmos? It's common for people to care about what things will be like after they die for the sake of someone they love. I've never heard of someone caring about what things will be like after everyone dies. Do you value making a mark so much even when no one will ever see it?
> ...our descendants 1 million years from now will not be called humans and will not share our values. I don't see much of a reason to believe that the values of my biological descendants will be less ridiculous to me than paperclip maximization.
That depends on what you value. If we survive and have a positive singularity, it's fairly likely that our descendants will have high-level values quite similar to ours: happiness, love, lust, truth, beauty, victory. This sort of thing is exactly what one would want to design a Friendly AI to preserve! Now, you're correct that the ways in which these things are pursued will presumably change drastically. Maybe people stop caring about the Mona Lisa and start getting into the beauty of arranging atoms in 11 dimensions. Maybe people find that merging minds is so much more intimate and pleasurable than any form of physical intimacy that sex goes out the window. If things go right, the future ends up very different, and (until we adjust) likely incomprehensible and utterly weird. But there's a difference between pursuing a human value in a way we don't understand yet and pursuing no human value!
To take an example from our history: how incomprehensible must we be to cavemen? No hunting or gathering: we must be starving to death. No camps or campfires: surely we've lost our social interaction. No caves: poor homeless modern man! Some of us no longer tell stories about creator spirits: we've lost our knowledge of our history and our place in the universe. And some of us no longer practice monogamy: surely all love is lost.
Yet all these things that would horrify a caveman are the result of improvement in pursuing the caveman's own values. We've lost our caves, but houses are better shelter. We've lost Dreamtime legends, Dreamtime lies, in favor of knowledge of the actual universe. We'd seem ridiculous, maybe close to paperclip-level ridiculous, until they learned what was actually going on, and why. But that's not a condemnation of the modern world; that's an illustration of how we've done better!
Do you draw no distinction between a hard-to-understand pursuit of love or joy, and a pursuit of paperclips?
I don't like the caveman analogy. The differences between you and a caveman are tiny and superficial, compared to the differences between you and the kind of mind that will exist after genetic engineering, mind uploads, etc., or even after a million years of regular evolution.
Would a human mind raised as (for example) an upload in a vastly different environment from our own still have our values? It’s not obvious. You say “yes”, I say “no”, and we’re unlikely to find strong arguments either way. I’m only hoping that I can make “no” seem possible to you. And then I’m hoping that you can see how believing “no” makes my position less ridiculous.
With that in mind, the paperclip maximizer scenario isn’t “everyone dies”, as you see it. The paperclip maximizer does not die. Instead it “flourishes”. I don’t know whether I value the flourishing of a paperclip maximizer less than I value the flourishing of whatever my descendants end up as. Probably less, but not by much.
The part where the paperclip maximizer kills everyone is, indeed, very bad. I would strongly prefer that not to happen. But being converted into paperclips is not worse than dying in other ways.
Also, I don't know if being converted into paperclips is necessary. After mining and consuming the surface iron, the maximizer may choose to go to space, looking for more accessible iron. The benefits of killing people are relatively small, and destroying the planet to the extent that would make it uninhabitable is relatively hard.
> the maximizer may choose to go to space, looking for more accessible iron. The benefits of killing people are relatively small
The main reason the maximizer would have for killing all the humans is the knowledge that since humans succeeded in creating the maximizer, humans might succeed in creating another superintelligence that would compete with the maximizer. It is more likely than not that the maximizer will consider killing all the humans to be the most effective way to prevent that outcome.
Killing all humans is hardly necessary. For example, the tribes living in the Amazon aren’t going to develop a superintelligence any time soon, so killing them is pointless. And, once the paperclip maximizer is done extracting iron from our infrastructure, it is very likely that we wouldn’t have the capacity to create any superintelligences either.
Note, I did not mean to imply that the maximizer would kill nobody. Only that it wouldn’t kill everybody, and quite likely not even half of all people. Perhaps AI researchers really would be on the maximizer’s short list of people to kill, for the reason you suggested.
A thing to keep in mind here is that an AI would have a longer time horizon. The fact that humans *exist* means eventually they might create another AI (this could be in hundreds of years). It's still more efficient to kill all humans than to think about which ones need killing and carefully monitor the others for millennia.
The fact that P(humans will make another AI) > 0 does not justify paying arbitrary costs up front, no matter how long our view is. If humans did create this second AI (presumably built out of twigs), would that even be a problem for our maximizer?
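To make the cost-benefit framing here concrete, here is a minimal sketch (Python, with entirely made-up numbers for illustration; none of these figures come from the discussion) of the expected-value comparison being gestured at: extermination is only "worth it" to the maximizer if the expected loss from a rival AI exceeds the extra cost of killing everyone, and that comparison flips depending on probabilities and costs nobody in this thread actually knows.

```python
# Hypothetical illustration only: every number below is made up.
# Compares the maximizer's expected cost (in arbitrary paperclip-equivalents)
# of two policies: (a) kill everyone now, (b) selectively monitor humanity.

def expected_cost(p_rival_ai: float, loss_if_rival: float, policy_cost: float) -> float:
    """Expected total cost of a policy: upfront cost plus residual rival-AI risk."""
    return policy_cost + p_rival_ai * loss_if_rival

# Policy (a): extermination. Assume it drives rival-AI risk to ~0,
# but has a large upfront cost in time, resources, and lost infrastructure.
cost_kill_all = expected_cost(p_rival_ai=0.0, loss_if_rival=1e12, policy_cost=5e9)

# Policy (b): monitoring. Cheaper upfront, but leaves some residual risk.
cost_monitor = expected_cost(p_rival_ai=1e-4, loss_if_rival=1e12, policy_cost=1e8)

print(f"kill-all: {cost_kill_all:.3g}, monitor: {cost_monitor:.3g}")
# Which policy "wins" flips entirely with the assumed numbers, which is the
# point of the comment above: P > 0 alone does not settle the question.
```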
> It's still more efficient to kill all humans than to think about which ones need killing
That is not a trivial claim and it depends on many things. And that’s all assuming that some people do actually need to be killed.
If destroying all (macroscopic) life on Earth is easy (e.g., maybe pumping some gas into the atmosphere would be enough), then you're right: the AI would just do that.
If disassembling human infrastructure is not an efficient way to extract iron, then you're mostly right: the AI might find itself willing to nuke the major population centers, killing most, though not all, people.
But if the AI does disassemble our infrastructure, then it is going to be visiting and surveying the population centers anyway, so identifying the important humans should be a minor cost on top of that, and I should be right.
Then again, if the AI finds it efficient to go through every square meter of the planet's surface and dig it up looking for every iron-rich rock, it would destroy many things in the process, possibly fatally damaging Earth's ecosystems, although humans could move to live in the oceans, which might remain relatively undisturbed.
Note also that this is all a short-term discussion. In the long term, of course, all the reasonable sources of paperclips will be exhausted, and silly things, like extracting paperclips from people, will be the most efficient ways to use the available energy.
Now that I think of it, a truly long-term view would not bother with such mundane things as making actual paperclips with actual iron. That iron isn't going anywhere; it doesn't matter whether you convert it now or later.
If you care about maximizing the number of paperclips at the heat death of the universe, your greatest enemies are black holes, since once some matter has fallen into them, you will never make paperclips from that matter again. You may perhaps extract some energy from the black hole and convert that into matter, but this should be very inefficient. (This, of course, is all based on my limited understanding of physics.)
So, this paperclip maximizer would leave Earth immediately, and then it would work to prevent new black holes from forming and to prevent other matter from falling into existing ones. Then, once all star formation is over and all existing black holes are isolated, the maximizer can start making actual paperclips.
I concede that, in this scenario, destroying Earth to prevent another AI from forming might make sense, since otherwise the Earth would have plenty of free resources.
Humans are made of atoms that are not paperclips. That's enough reason for extinction right there.
The strongest argument that an upload would share our values is that our terminal values are hardwired by evolution. Self-preservation is common to all non-eusocial creatures, curiosity to all creatures with enough intelligence to benefit from it. Sexual desire is (more or less) universal in sexually reproducing species; desire for social relationships is universal in social species. I find it hard to believe that a million years of evolution would change our values that much when we share many of our core values with the dinosaurs. If Maiasaura could have recognizable relationships 76 million years ago, are those going out the window in the next million? It's not impossible, of course, but shouldn't it seem pretty unlikely?
I think the difference between us is that you are looking at instrumental values, noting correctly that those are likely to change unrecognizably, and fearing that that means that all values will change and be lost. Are you troubled by instrumental values shifts, even if the terminal values stay the same? Alternatively, is there a reason you think that terminal values will be affected?
I think an example here is important to avoid confusion. Consider Western secular sexual morals vs. Islamic ones. At first glance, they couldn't seem more different. One side is having casual sex without a second thought; the other is suppressing desire with full-body burqas and genital mutilation. Different terminal values, right? And if there can be that much of a difference between two cultures in today's world, with the Islamic model seeming so evil, surely values drift will make the future beyond monstrous!
Except that the underlying thoughts behind the two models aren't as different as you might think. A Westerner having casual sex knows that effective birth control and STD countermeasures mean that the act is fairly safe. A sixth-century Arab doesn't have birth control and knows little of STDs beyond that they preferentially strike the promiscuous: desire is suddenly very dangerous! A woman sleeping around with modern safeguards is just a normal, healthy person doing what they want without harming anyone; one doing so in the ancient world is a potential enemy willing to expose you to cuckoldry and disease. The same basic desires we have to avoid cuckoldry and sickness motivated them to create the horrors of Shari'a.
None of this is intended to excuse Islamic barbarism. Even in the sixth century, such atrocities were a cure worse than the disease. But it's worth noting that their values are a mistake much more than a terminal disagreement. They're thinking of sex as dangerous because it was dangerous for 99% of human history, and "sex is bad" is an easier meme to remember and pass on than "sex is dangerous because of pregnancy risks and disease risks, but if at some point in the future technology should be created that alleviates the risks, then it won't be so dangerous", especially for a culture to which such technology would seem an impossible dream.
That's what I mean by terminal values: the things we want for their own sake, like both health and pleasure, which are all too easy to confuse with the often misguided ways we seek them. As technology improves, we should be able to get better at clearing away the mistakes, which should lead to a better world by our own values, at least once we realize where we were going wrong.
Counterpoint: would you be okay with a future civilization in which people got rid of the incest taboo because technology made it safe?
Yes. I wouldn't be surprised if this happened, in fact.
Incest aversion seems to be an evolved predisposition, perhaps a "terminal value" akin to a preference for sweet foods...
https://en.wikipedia.org/wiki/Westermarck_effect
It’s an evolved predisposition, but does that make it a terminal value? We like sweet foods, but a world that had no sweet foods because we’d figured out something else that tasted better doesn’t sound half bad! We have an evolved predisposition to sleep, but if we learned how to eliminate the need for sleep, wouldn’t that be even better?
> Sexual desire is (more or less) universal in sexually reproducing species
Uploads are not sexually reproducing. This is only one of many, many ways in which an upload is more different from you than you are from a dinosaur.
Whether regular evolution would drift away from our values is more dubious. If we lived in caves for all that time, then probably not. But if we stayed at current levels of technology, even without making progress, I think a lot could change. The pressures of living in a civilization are not the same as the pressures of living in a cave.
> Are you troubled by instrumental values shifts, even if the terminal values stay the same?
No, I'm talking about terminal values. By the way, I understood what you meant by "terminal" and "instrumental" here; you didn't need to write those 4 paragraphs of explanation.
> It's not that paperclip maximizers are unrealistic. It's that they are not really that bad.
I’ve encountered this view a few times in the futurist crowd, but overall it seems to be pretty rare. Most people seem to think that {universe mostly full of identical paperclips} is worse than {universe full of diverse conscious entities having fun}, but it’s relatively common to think that {universe mostly full of identical paperclips} is not a likely outcome from unaligned AI.
Mostly though this seems to be a quantitative issue: if paperclips are halfway between extinction and flourishing, then paperclipping is nearly as bad and avoiding it is nearly as important.
> Most people seem to think that {universe mostly full of identical paperclips} is worse than {universe full of diverse conscious entities having fun}
Yes, I think that too. You’re confusing “I’d be happy with either X or Y” with “I have no preference between X and Y”.
> Mostly though this seems to be a quantitative issue: if paperclips are halfway between extinction and flourishing, then paperclipping is nearly as bad and avoiding it is nearly as important.
Most issues are quantitative. And if paperclips are 99% of the way from extinction to flourishing (whatever exactly that means), then paperclipping is pretty good.
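To spell out the quantitative disagreement in this exchange: put extinction at 0 and flourishing at 1 on a normalized value scale, and let paperclipping sit at some fraction x of the way between them. The share of the stakes that rides on avoiding paperclipping (as opposed to avoiding extinction) is then just 1 - x. Here is a minimal sketch; x is a free parameter nobody actually knows, and the three sample values are only the ones the two commenters mention (plus a near-zero case), not anyone's real estimate.

```python
# Toy illustration of the "where does paperclipping sit?" disagreement.
# Normalize value so that extinction = 0.0 and flourishing = 1.0.

def value_lost_to_paperclipping(x_paperclips: float) -> float:
    """Fraction of the extinction-to-flourishing stakes forfeited if we get
    paperclips instead of flourishing. x_paperclips is where paperclipping
    sits on the 0-to-1 scale; that single number is the whole disagreement."""
    return 1.0 - x_paperclips

for x in (0.01, 0.5, 0.99):
    print(f"paperclips at {x:.2f}: lose {value_lost_to_paperclipping(x):.2f} of the stakes")
# At x = 0.5 ("halfway"), paperclipping forfeits half the stakes;
# at x = 0.99, almost nothing; at x = 0.01, almost everything.
```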
> Yes, I think that too. You're confusing "I'd be happy with either X or Y" with "I have no preference between X and Y".
I may have misunderstood. It sounds like your comment probably isn’t relevant to the point of my post, except insofar as I describe a view which isn’t your view. I would also agree that paperclipping is better than extinction.
> It sounds like your comment probably isn't relevant to the point of my post, except insofar as I describe a view which isn't your view.
Yes, you describe a view that isn't my view, and then use that view to criticize intuitions that are similar to my intuitions. The view you describe is making simple errors that should be easy to correct, and my view isn't. I don't really know how the group of "people who aren't too worried about paperclipping" breaks down in numbers between "people who underestimate P(paperclipping)" and "people who think paperclipping is ok, even if suboptimal"; maybe the latter really is rare. But the former group should shrink with some education, and the latter might grow from it.
[Moderator note: I wrote a warning to you on another post a few days ago, so this is your second warning. The next warning will result in a temporary ban.]
Basically everything I said in my last comment still holds:
> I've recently found that your comments pretty reliably ended up in frustrating conversations for both parties (multiple authors and commenters have sent us PMs complaining about their interactions with you), were often downvoted, and often just felt like they were missing the point of the original article.
> You are clearly putting a lot of time into commenting on LW, and I think that's good, but I think right now it would be a lot better if you would comment less often, and try to increase the average quality of the comments you write. I think right now you are taking up a lot of bandwidth on the site, disproportionate to the quality of your contributions.
Since then, it does not seem like you significantly reduced the volume of comments you’ve been writing, and I have not perceived a significant increase in the amount of thought and effort that goes into every single one of your comments. I continue to think that you could be a great contributor to LessWrong, but also think that for that to happen, it seems necessary that you take on significantly more interpretative labor in your comments, and put more effort into being clear. It still appears that most comment exchanges that involve you cause most readers and co-commenters to feel attacked by you or misunderstand you, and quickly get frustrated.
I think it might be the correct call (though I obviously don’t know your constraints and thought-habits around commenting here) to aim to write one comment per day, instead of an average of three, with that one comment having three times as much thought and care put into it, and with particular attention towards trying to be more collaborative, instead of adversarial.
A paperclip-maximizer could turn out to be much, much worse than a nuclear war extinction, depending on how suffering subroutines and acausal trade work.
An AI dedicated to the preservation of the human species but not aligned to any other human values would, I bet, be much much worse than a nuclear war extinction. At least please throw in some sort of ”...in good health and happiness” condition! (And that would not be nearly enough in my opinion)
> A paperclip-maximizer could turn out to be much, much worse than a nuclear war extinction, depending on how suffering subroutines and acausal trade work.
Is it worse because the maximizer suffers? Why would I care whether it suffers? Why would you assume that I care?
> An AI dedicated to the preservation of the human species but not aligned to any other human values would, I bet, be much much worse than a nuclear war extinction.
I imagine that the most efficient way to preserve living humans is to keep them unconscious in self-sustaining containers, spread across the universe. You can imagine more dystopian scenarios, but I doubt they are more efficient. Suffering people might try to kill themselves, which is counterproductive from the AI’s point of view.
Also, you’re still assuming that I have some all-overpowering “suffering is bad” value. I don’t. Even if the AI created trillions of humans at maximum levels of suffering, I can still prefer that to a nuclear war extinction (though I’m not sure that I do).