I don’t like the caveman analogy. The differences between you and a caveman are tiny and superficial, compared to the differences between you and the kind of mind that will exist after genetic engineering, mind uploads, etc., or even after a million years of regular evolution.
Would a human mind raised as (for example) an upload in a vastly different environment from our own still have our values? It’s not obvious. You say “yes”, I say “no”, and we’re unlikely to find strong arguments either way. I’m only hoping that I can make “no” seem possible to you. And then I’m hoping that you can see how believing “no” makes my position less ridiculous.
With that in mind, the paperclip maximizer scenario isn’t “everyone dies”, as you see it. The paperclip maximizer does not die. Instead it “flourishes”. I don’t know whether I value the flourishing of a paperclip maximizer less than I value the flourishing of whatever my descendants end up as. Probably less, but not by much.
The part where the paperclip maximizer kills everyone is, indeed, very bad. I would strongly prefer that not to happen. But being converted into paperclips is not worse than dying in other ways.
Also, I don’t know if being converted into paperclips is necessary—after mining and consuming the surface iron, the maximizer may choose to go to space, looking for more accessible iron. The benefits of killing people are relatively small, and destroying the planet to the extent that would make it uninhabitable is relatively hard.
> the maximizer may choose to go to space, looking for more accessible iron. The benefits of killing people are relatively small
The main reason the maximizer would have for killing all the humans is the knowledge that since humans succeeded in creating the maximizer, humans might succeed in creating another superintelligence that would compete with the maximizer. It is more likely than not that the maximizer will consider killing all the humans to be the most effective way to prevent that outcome.
Killing all humans is hardly necessary. For example, the tribes living in the Amazon aren’t going to develop a superintelligence any time soon, so killing them is pointless. And once the paperclip maximizer is done extracting iron from our infrastructure, we would very likely lack the capacity to create any superintelligences either.
Note, I did not mean to imply that the maximizer would kill nobody. Only that it wouldn’t kill everybody, and quite likely not even half of all people. Perhaps AI researchers really would be on the maximizer’s short list of people to kill, for the reason you suggested.
A thing to keep in mind here is that an AI would have a longer time horizon. The fact that humans *exist* means eventually they might create another AI (this could be in hundreds of years). It’s still more efficient to kill all humans than to think about which ones need killing and carefully monitor the others for millennia.
The fact that P(humans will make another AI) > 0 does not justify paying arbitrary costs up front, no matter how long our view is. If humans did create this second AI (presumably built out of twigs), would that even be a problem for our maximizer?
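To make that explicit, here is a minimal sketch of the comparison, in symbols I am making up purely for illustration (none of them come from the scenario itself): wiping out everyone beats selective killing plus monitoring only if the reduction in rival-AI risk is worth the extra cost, i.e. only if

\[
\big(P(\text{rival AI} \mid \text{monitor}) - P(\text{rival AI} \mid \text{kill all})\big) \cdot L \;>\; C_{\text{kill all}} - C_{\text{monitor}},
\]

where \(L\) is the maximizer’s expected loss of paperclips if a rival superintelligence is ever built, and the \(C\) terms are the direct costs of each policy (including paperclips forgone). A longer time horizon makes \(L\) larger, but it doesn’t settle the inequality by itself; that still depends on how big the probability gap and the cost gap actually are.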
> It’s still more efficient to kill all humans than to think about which ones need killing
That is not a trivial claim, and it depends on many things. And that’s all assuming that some people do actually need to be killed.
If destroying all (macroscopic) life on earth is easy, e.g. if pumping some gas into the atmosphere were enough, then you’re right, the AI would just do that.
If disassembling human infrastructure is not an efficient way to extract iron, then you’re mostly right: the AI might find itself willing to nuke the major population centers, killing most, though not all, people.
But if the AI does disassemble infrastructure, then it is going to be visiting and surveying the population centers anyway, so identifying the important humans should be a minor cost on top of that, and I should be right.
Then again, if the AI finds it efficient to go through every square meter of the planet’s surface and dig it up looking for every iron-rich rock, it would destroy many things in the process, possibly fatally damaging earth’s ecosystems, although humans could move to live in the oceans, which might remain relatively undisturbed.
Note also that this is all a short-term discussion. In the long term, of course, all the reasonable sources of paperclips will be exhausted, and silly things, like extracting paperclips from people, will become the most efficient ways to use the available energy.
Now that I think of it, a truly long-term view would not bother with such mundane things as making actual paperclips with actual iron. That iron isn’t going anywhere; it doesn’t matter whether you convert it now or later.
If you care about maximizing the number of paperclips at the heat death of the universe, your greatest enemies are black holes, since once some matter has fallen into them, you will never make paperclips from that matter again. You may perhaps extract some energy from a black hole and convert that into matter, but this should be very inefficient. (This, of course, is all based on my limited understanding of physics.)
So, this paperclip maximizer would leave earth immediately, and then it would work to prevent new black holes from forming and to prevent other matter from falling into existing ones. Then, once all star formation is over and all existing black holes are isolated, the maximizer can start making actual paperclips.
I concede that, in this scenario, destroying earth to prevent another AI from forming might make sense, since otherwise the earth would have plenty of free resources.
Humans are made of atoms that are not paperclips. That’s enough reason for extinction right there.
The strongest argument that an upload would share our values is that our terminal values are hardwired by evolution. Self-preservation is common to all non-eusocial creatures, curiosity to all creatures with enough intelligence to benefit from it. Sexual desire is (more or less) universal in sexually reproducing species, and desire for social relationships is universal in social species. I find it hard to believe that a million years of evolution would change our values that much when we share many of our core values with the dinosaurs. If Maiasaura could have recognizable relationships 76 million years ago, are those going out the window in the next million? It’s not impossible, of course, but shouldn’t it seem pretty unlikely?
I think the difference between us is that you are looking at instrumental values, noting correctly that those are likely to change unrecognizably, and fearing that that means that all values will change and be lost. Are you troubled by instrumental values shifts, even if the terminal values stay the same? Alternatively, is there a reason you think that terminal values will be affected?
I think an example here is important to avoid confusion. Consider Western secular sexual morals vs. Islamic ones. At first glance, they couldn’t seem more different. One side is having casual sex without a second thought; the other is suppressing desire with full-body burqas and genital mutilation. Different terminal values, right? And if there can be that much of a difference between two cultures in today’s world, with the Islamic model seeming so evil, surely values drift will make the future beyond monstrous!
Except that the underlying thoughts behind the two models aren’t as different as you might think. A Westerner having casual sex knows that effective birth control and STD countermeasures mean that the act is fairly safe. A sixth-century Arab doesn’t have birth control and knows little of STDs beyond that they preferentially strike the promiscuous, so desire is suddenly very dangerous! A woman sleeping around with modern safeguards is just a normal, healthy person doing what they want without harming anyone; one doing so in the ancient world is a potential enemy willing to expose you to cuckoldry and disease. The same basic desires we have to avoid cuckoldry and sickness motivated them to create the horrors of Shari’a.
None of this is intended to excuse Islamic barbarism. Even in the sixth century, such atrocities were a cure worse than the disease. But it’s worth noting that their values are a mistake much more than a terminal disagreement. They’re thinking of sex as dangerous because it was dangerous for 99% of human history, and “sex is bad” is an easier meme to remember and pass on than “sex is dangerous because of pregnancy risks and disease risks, but if at some point in the future technology should be created that alleviates the risks, then it won’t be so dangerous”, especially for a culture to which such technology would seem an impossible dream.
That’s what I mean by terminal values: the things we want for their own sake, like health and pleasure, which are all too easy to confuse with the often misguided ways we seek them. As technology improves, we should be able to get better at clearing away the mistakes, which should lead to a better world by our own values, at least once we realize where we were going wrong.
Counterpoint: would you be okay with a future civilization in which people got rid of the incest taboo, because technology made it safe?
Yes. In fact, I wouldn’t be surprised if this happened.
Incest aversion seems to be an evolved predisposition, perhaps a “terminal value” akin to a preference for sweet foods...
https://en.wikipedia.org/wiki/Westermarck_effect
It’s an evolved predisposition, but does that make it a terminal value? We like sweet foods, but a world that had no sweet foods because we’d figured out something else that tasted better doesn’t sound half bad! We have an evolved predisposition to sleep, but if we learned how to eliminate the need for sleep, wouldn’t that be even better?
> Sexual desire is (more or less) universal in sexually reproducing species
Uploads are not sexually reproducing. This is only one of many, many ways in which an upload is more different from you than you are from a dinosaur.
Whether regular evolution would drift us away from our values is more dubious. If we lived in caves for all that time, then probably not. But if we stayed at current levels of technology, even without making progress, I think a lot could change. The pressures of living in a civilization are not the same as the pressures of living in a cave.
> Are you troubled by instrumental values shifts, even if the terminal values stay the same?
No, I’m talking about terminal values. By the way, I understood what you meant by “terminal” and “instrumental” here; you didn’t need to write those four paragraphs of explanation.