If it optimizes this (saves life) and otherwise interferes the least, it has already done an excellent job.
I think the standard sort of response for this is The Hidden Complexity of Wishes. Just off the top of my (non-superintelligent) head, the AI could notice a method for near-perfect continuation of life by preserving some bacteria at the cost of all other life forms.
I did not mean the comment that literally. Dropped too many steps for brevity, thought they were clear, I apologize.
It would be just as impossible (or even more impossible) to convince people that total obliteration of people is a good thing. On the other hand, people don’t care much about bacteria, even whole species of them, and as long as a few specimens remain in laboratories, people will be ok about the rest being obliterated.
It would be just as impossible (or even more impossible) to convince people that total obliteration of people is a good thing.
There are lots of people who do think that’s a good thing, and I don’t think those people are trolling or particularly insane. There are entire communities where people have sterilized themselves as part of a mission to end humanity (for the sake of Nature, or whatever).
I think those people do have insufficient knowledge and intelligence. For example, the Skoptsy sect, who believed they followed God’s will, were, presumably, factually wrong. And people who want to end humanity for the sake of Nature want that instrumentally, because they believe that otherwise Nature will be destroyed. Assuming FAI is created, this belief is also probably wrong.
You’re right that there are people who would place “all non-intelligent life” before “all people”, if there were such a choice. But that does not mean they would choose “non-intelligent life” before “non-intelligent life + people”.
people who want to end humanity for the sake of Nature want that instrumentally, because they believe that otherwise Nature will be destroyed. Assuming FAI is created, this belief is also probably wrong.
That depends a lot on what I understand Nature to be. If Nature is something incompatible with artificial structuring, then as soon as a superhuman optimizing system structures my environment, Nature has been destroyed… no matter how many trees and flowers and so forth are left.
Personally, I think caring about Nature as something independent of “trees and flowers and so forth” is kind of goofy, but there do seem to be people who care about that sort of thing.
What if particular arrangements of flowers, trees and so forth are complex and interconnected, in ways that can be undone to the net detriment of said flowers, trees and so forth? Thinking here of attempts at scientifically “managing” forest resources in Germany with the goal of making them as accessible and productive as possible. The resulting tree farms were far less resistant to disease, climatic aberration, and so on, and generally not very healthy, because it turns out that the illegible, sloppy factor that made the forests seem less conveniently organized for human uses was a non-negligible portion of what allowed them to be so productive and robust in the first place.
No individual tree or flower is all that important, but the arrangement is, and you can easily destroy it without necessarily destroying any particular tree or flower. I’m not sure what to call this, and it’s definitely not independent of the trees and flowers and so forth, but it can be destroyed to the concrete and demonstrable detriment of what’s left.
I don’t know forestry from my elbow, but I used to read a blog by someone who was pretty into saltwater fish tanks. Now, one property of these tanks is that they’re really sensitive to a bunch of feedback loops that can most easily be stabilized by approximating a wild reef environment; if you get the lighting or the chemical balance of the water wrong, or if you don’t get a well-balanced polyculture of fish and corals and random invertebrates going, the whole system has a tendency to go out of whack and die.
This can be managed to some extent with active modification of the tank, and the health of your tank can be described in terms of how often you need to tweak it. Supposing you get the balance just right, so that you only need to provide the right energy inputs and your tank will live forever: is that Nature? It certainly seems to have the factors that your ersatz German forest lacks, but it’s still basically two hundred enclosed gallons of salt water hooked up to an aeration system.
That’s something like my objection to CEV—I currently believe that some fraction of important knowledge is gained by blundering around and (or?) that the universe is very much more complex than any possible theory about it.
This means that you can’t fully know what your improved (by what standard?) self is going to be like.
I’m not quite sure what you mean to ask by the question. If maintaining a particular arrangement of flowers, trees and so forth significantly helps preserve their health relative to other things I might do, and I value their health, then I ought to maintain that arrangement.
Well, I certainly agree that increasing my knowledge and intelligence might have the effect of changing my beliefs about the world in such a way that I stop valuing certain things that I currently value, and I find it likely that the same is true of everyone else, including the folks who care about Nature.
It’s not even strictly true. It’s entirely conceivable that FAI will lead to the Sol system being converted into a big block of computronium to run human brain simulations. Even if those simulations have trees and animals in them, I think that still counts as the destruction of nature.
But if FAI is based on CEV, then this will only happen if this is the extrapolated wish of everybody. Assuming existence of people truly (even after extrapolation) valuing Nature in its original form, such computroniums won’t be forcefully built.
Nope. CEV that functioned only unanimously wouldn’t function at all. The course of the future would go to the majority faction. Honestly, I think CEV is a convoluted, muddy mess of an idea that attempts to solve the hard question of how to teach the AI what we want by replacing it with the harder question of how to teach it what we should want. But that’s a different debate.
CEV that functioned only unanimously wouldn’t function at all
Why not? I believe that at least one unanimous extrapolated wish exists—for (sentient) life on the planet to continue. If FAI ensured that and left everything else for us to decide, I’d be happy.
That is not by any means guaranteed to be unanimous. I would be very surprised if at least one person didn’t want all sapient life to end, deeply enough for that to persist through extrapolation. I mean, look at all the doomsday cults in the world.
Yes, it is only a hypothesis. Until we actually build an AI with such CEV as utility, we cannot know whether it could function. But at least, running it is uncontroversial by definition.
And I think I’ll be more surprised if anyone was found who really and truly had a terminal value for universal death. With some strain, I can imagine someone preferring it conditionally, but certainly not absolutely. The members of doomsday cults, I expect, are either misinformed, insincere, or unhappy about something else (which FAI could fix!).
Until we actually build an AI with such CEV as utility, we cannot know whether it could function. But at least, running it is uncontroversial by definition.
It’s quite controversial. Supposing CEV worked exactly as expected, I still wouldn’t want it to be done. Neither do some others in this thread. And I’m sure neither would most humans in the street if you were to ask them (and they seriously thought about the question).
CEV doesn’t and cannot predict that the extrapolated wishes of everybody will perfectly coincide. Rather, it says it will find the best possible compromise. Of course I would prefer my own values to a compromise! Lacking that, I would prefer a compromise over a smaller group whose members were more similar to myself (such as the group of people actually building the AI).
I might choose CEV over something else because plenty of other things are even worse. But CEV is very very far from the best possible thing, or even the best not-totally-implausible AGI I might expect in my actual future.
And I think I’ll be more surprised if anyone was found who really and truly had a terminal value for universal death
Any true believer in a better afterlife qualifies: there are billions of people who at least profess such beliefs, so I expect some of them really believe.
CEV doesn’t and cannot predict that the extrapolated wishes of everybody will perfectly coincide. Rather, it says it will find the best possible compromise.
What I proposed in this thread is that CEV would forcibly implement only the (extrapolated) wish(es) of literally everyone. Regarding the rest, it is to minimize its influence, leaving all decisions to people.
Any true believer in a better afterlife qualifies
No, because they believe in an afterlife. They do not wish for universal death. Extrapolating their wish with correct knowledge solves the problem.
What I proposed in this thread is that CEV would forcibly implement only the (extrapolated) wish(es) of literally everyone.
Well then, as I and others argue elsewhere in the thread, we anticipate there will be no extrapolated wishes that literally everyone agrees on.
(And that’s even without considering some meta formulations of CEV that propose to also take into account the wishes of counterfactual people who might exist in the future, and dead ones who existed in the past.)
No, because they believe in an afterlife. They do not wish for universal death. Extrapolating their wish with correct knowledge solves the problem.
Lots of people religiously believe that their god has planned (and prophesied) a specific event of drastic universal change, after which future people will stop suffering in this world, or will stop being born to a life of negative utility (end of the world), or will be rescued from horrible eternal torture (Hell), or which is necessary for the true believers to actually be resurrected or to enter the good afterlife. (Obviously people don’t believe all of this at once; these are variant examples.)
Some others believe that life in this world is suffering, negative utility, and ought to be stopped for its own sake (stopping the cycle of rebirth).
Well, now you know there exist people who believe that there are some universally acceptable wishes. Let’s do the Aumann update :)
Aumann update works only if I believe you’re a perfect Bayesian rationalist. So, no thanks.
Since you aren’t giving any valid examples of universally acceptable wishes (I’ve pointed out people who don’t wish for the examples you gave), why do you believe such wishes exist?
False beliefs ⇒ irrelevant after extrapolation.
Only if you modify these actual people to have their extrapolated beliefs instead of their current ones. Otherwise the false current beliefs will keep on being very relevant to them. Do you want to do that?
Too bad. Let’s just agree to disagree then, until the brain scanning technology is sufficiently advanced.
Or until you provide the evidence that causes you to hold your opinions.
So far, I haven’t seen a convincing example of a person who truly wished for everyone to die, even in extrapolation.
I think it’s plausible such people exist. Conversely, if you fine-tune your implementation of “extrapolation” to make their extrapolated values radically different from their current values (and incidentally matching your own current values), that’s not what CEV is supposed to be about. But before talking about that, there’s a more important point:
To them, yes, but not to their CEV.
So why do you care about their extrapolated values? If you think CEV will extrapolate something that matches your current values but not those of many others; and you don’t want to change by force others’ actual values to match their extrapolated ones, so they will suffer in the CEV future; then why extrapolate their values at all? Why not just ignore them and extrapolate your own, if you have the first-mover advantage?
Extrapolated values are the true values. Whereas the current values are approximations, sometimes very bad and corrupted approximations.
What makes you give them such a label as “true”? There is no such thing as a “correct” or “objective” value. All values are possible in the sense that there can be agents with all possible values, even paperclip-maximizing ones. The only interesting property of values is who actually holds them. But nobody actually holds your extrapolated values (today).
Current values (and values in general) are not approximations of any other values. All values just are. Why do you call them approximations?
they will suffer in the CEV future
This does not follow.
In your CEV future, the extrapolated values are maximized. Conflicting values, like the actual values held today by many or all people, are necessarily not maximized. In proportion to how much this happens, which is positively correlated to the difference between actual and extrapolated values, people who hold the actual values will suffer living in such a world. (If the AI is a singleton they will not even have a hope of a better future.)
Briefly: suffering ~ failing to achieve your values.
They are reflectively consistent in the limit of infinite knowledge and intelligence. This is a very special and interesting property.
In your CEV future, the extrapolated values are maximized. Conflicting values, like the actual values held today by many or all people, are necessarily not maximized.
But people would change—gaining knowledge and intelligence—and thus would become happier and happier with time. And I think CEV would try to synchronize this with the timing of its optimization process.
They are reflectively consistent in the limit of infinite knowledge and intelligence. This is a very special and interesting property.
Paperclipping is also self-consistent in that limit. That doesn’t make me want to include it in the CEV.
But people would change—gaining knowledge and intelligence—and thus would become happier and happier with time.
Evidence please. There’s a long long leap from ordinary gaining knowledge and intelligence through human life, to “the limit of infinite knowledge and intelligence”. Moreover we’re considering people who currently explicitly value not updating their beliefs in the face of knowledge, and basing their values on faith not evidence. For all I know they’d never approach your limit in the lifetime of the universe, even if it is the limit given infinite time. And meanwhile they’d be very unhappy.
And I think CEV would try to synchronize this with the timing of its optimization process.
So you’re saying it wouldn’t modify the world to fit their new evolved values until they actually evolved those values? Then for all we know it would never do anything at all, and the burden of proof is on you to show otherwise. Or it could modify the world to resemble their partially-evolved values, but then it wouldn’t be a CEV, just a maximizer of whatever values people happen to already have.
Paperclipping is also self-consistent in that limit. That doesn’t make me want to include it in the CEV
Then we can label paperclipping as a “true” value too. However, I still prefer true human values to be maximized, not true clippy values.
Evidence please. There’s a long long leap from ordinary gaining knowledge and intelligence through human life, to “the limit of infinite knowledge and intelligence”. Moreover we’re considering people who currently explicitly value not updating their beliefs in the face of knowledge, and basing their values on faith not evidence. For all I know they’d never approach your limit in the lifetime of the universe, even if it is the limit given infinite time. And meanwhile they’d be very unhappy.
As I said before, if someone’s mind is that incompatible with truth, I’m ok with ignoring their preferences in the actual world. They can be made happy in a simulation, or wireheaded, or whatever the combined other people’s CEV thinks best.
So you’re saying it wouldn’t modify the world to fit their new evolved values until they actually evolved those values?
No, I’m saying, the extrapolated values would probably estimate the optimal speed for their own optimization. You’re right, though, it is all speculation, and the burden of proof is on me. Or on whoever will actually define CEV.
As I said before, if someone’s mind is that incompatible with truth, I’m ok with ignoring their preferences in the actual world. They can be made happy in a simulation, or wireheaded, or whatever the combined other people’s CEV thinks best.
And as I and others said, you haven’t given any evidence that such people are rare or even less than half the population (with respect to some of the values they hold).
You’re right, though, it is all speculation, and the burden of proof is on me.
That’s a good point to end the conversation, then :-)
But at least, running it is uncontroversial by definition.
I’m very dubious of CEV as a model for Friendly AI. I think it’s a bad idea for several reasons. So, not that either.
Also, on topic, recall that, when you extrapolate the volition of crazy people, their volition is not, in particular, more sane. It is more as they would like to be. If you see lizard people, you don’t want to see lizard people less. You want sharpened senses to detect them better. Likewise, if you extrapolate a serial killer, you don’t get Gandhi. You get an incredibly good serial killer.
I’m very dubious of CEV as a model for Friendly AI. I think it’s a bad idea for several reasons. So, not that either.
I don’t see how this is possible. One can be dubious about whether it can be defined in the way it is stated, or whether it can be implemented. But assuming it can, why would it be controversial to fulfill the wish(es) of literally everyone, while affecting everything else the least?
when you extrapolate the volition of crazy people, their volition is not, in particular, more sane
Extrapolating volition includes correcting wrong knowledge and increasing intelligence. So, you do stop seeing lizard people if they don’t exist.
Serial killers are a more interesting example. But they too don’t want everyone to die. Assuming serial killers get full knowledge of their condition and sufficient intelligence for understanding it, what would their volition actually be? I don’t know, but I’m sure it’s not universal death.
But assuming it can, why would it be controversial to fulfill the wish(es) of literally everyone, while affecting everything else the least?
Problems:
Extrapolation is poorly defined, and, to me, seems to go in one of two directions: either you make people more as they would like to be, which throws any ideas of coherence out the window, or you make people ‘better’ along a specific axis, in which case you’re no longer directing the question back at humanity in a meaningful sense. Even something as simple as removing wrong beliefs (as you imply) would automatically erase any but the very weakest theological notions. There are a lot of people in the world who would die to stop that from happening. So, yes, controversial.
Coherence, one way or another, is unlikely to exist. Humans want a bunch of different things. Smarter, better-informed humans would still want a bunch of different, conflicting things. Trying to satisfy all of them won’t work. Trying to satisfy the majority at the expense of the minorities might get incredibly ugly incredibly fast. I don’t have a better solution at this time, but I don’t think taking some kind of vote over the sum total of humanity is going to produce any kind of coherent plan of action.
Trying to satisfy the majority at the expense of the minorities might get incredibly ugly incredibly fast.
But would that be actually uglier than the status quo? Right now, to a very good approximation, those who were born from the right vagina are satisfied at the expense of those born from the wrong vagina. Is that any better?
I call the Litany of Gendlin on the idea that everyone can’t be fully satisfied at once. And I also call the Fallacy of Gray on the idea that if you can’t do something perfectly, then doing it decently is no better than not doing it at all.
But would that be actually uglier than the status quo?
I don’t know. It conceivably could be, and there would be no possibility of improving it, ever. I’m just saying it might be wise to have a better model before we commit to something for eternity.
For extrapolation to be conceptually plausible, I imagine “knowledge” and “intelligence level” to be independent variables of a mind, knobs to turn. To be sure, this picture looks ridiculous. But assuming, for the sake of argument, that this picture is realizable, extrapolation appears to be definable.
Yes, many religious people wouldn’t want their beliefs erased, but only because they believe them to be true. They wouldn’t oppose increasing their knowledge if they knew it was true knowledge. Cases of belief in belief would be dissolved if it was known that true beliefs were better in all respects, including individual happiness.
Coherence, one way or another, is unlikely to exist. Humans want a bunch of different things...
Yes, I agree with this. But, I believe there exist wishes universal for (extrapolated) humans, among which I think there is the wish for humans to continue existing. I would like for AI to fulfil this wish (and other universal wishes if there are any), while letting people decide everything else for themselves.
AFAIK, CEV is not well-defined or fully specified, except as a declaration of intent, a research direction. Thus, it does not make sense to say whether CEV as a model for FAI does or does not in fact do specific things. It only makes sense to say whether CEV’s developers intend it to do or not do those things, and whether CEV’s specification so far contradicts or does not contradict those things.
AFAIU, CEV’s developers’ intent and CEV’s specification so far (with an added “unanimity” condition, if it is not present in the standard CEV specification) do not contradict my statement.
Just to make sure I understand your claim: you’re asserting that we can identify some set of people in the world right now who are “CEV’s developers,” and if we asked them “does CEV fulfill the wish(es) of literally everyone while affecting everything else the least?” they would agree that it clearly does?
No, because “does CEV fulfill....?” is not a well-defined or fully specified question. But I think, if you asked “whether it is possible to build FAI+CEV in such a way that it fulfills the wish(es) of literally everyone while affecting everything else the least”, they would say they do not know.
I believe that at least one unanimous extrapolated wish exists—for (sentient) life on the planet to continue.
Maybe there are better plans that don’t involve specifically “sentient” “life” continuing on a “planet”, concepts that could all be broken under sufficient optimization pressure if they don’t happen to be optimal. The simplest ones are “planet” and “life”: it doesn’t seem like a giant ball of simple elements could be the optimal living arrangement, or biological bodies (“life”, if that’s what you meant) an optimal living substrate.
Think of “sentient life continuing on the planet” as a single concept, extrapolatable in various directions as becomes necessary. So, “planet” can be substituted by something else.
That’s an interesting question, actually.
It’s the difference between the algorithm and its output, and the local particulars of portions of that output.
Presumably, because their knowledge and intelligence are not extrapolated enough.
Not that I’m a proponent of voluntary human extinction, but that’s an awfully big conditional.
Antinatalists exist.
That is not by any means guaranteed to be unanimous. I would be very surprised if at least one person didn’t want all sapient life to end, deeply enough for that to persist through extrapolation. I mean, look at all the doomsday cults in the world.
Yes, it is only a hypothesis. Until we actually built an AI with such CEV as utility, we cannot know whether it could function. But at least, running it is uncontroversial by definition.
And I think I’ll be more surprised if anyone was found who really and truly had a terminal value for universal death. With some strain, I can imagine someone preferring it conditionally, but certainly not absolutely. The members of doomsday cults, I expect, are either misinformed, insincere, or unhappy about something else (which FAI could fix!).
It’s quite controversial. Supposing CEV worked exactly as expected, I still wouldn’t want it to be done. Neither do some others in this thread. And I’m sure neither would most humans in the street if you were to ask them (and they seriously though about the question).
CEV doesn’t and cannot predict that the extrapolated wishes of everybody will perfectly coincide. Rather, it says it will find the best possible compromise. Of course I would prefer my own values to a compromise! Lacking that, I would prefer a compromise over a smaller group whose members were more similar to myself (such as the group of people actually building the AI).
I might choose CEV over something else because plenty of other things are even worse. But CEV is very very far from the best possible thing, or even the best not-totally-implausible AGI I might expect in my actual future.
Any true believer in a better afterlife qualifies: there are billions of people who at least profess such beliefs, so I expect some of them really believe.
What I proposed in this thread is that CEV would forcibly implement only the (extrapolated) wish(es) of literally everyone. Regarding the rest, it is to minimize its influence, leaving all decisions to people.
No, because they believe in an afterlife. They do not wish for universal death. Extrapolating their wish with correct knowledge solves the problem.
Well then, as I and others argue elsewhere in the thread, we anticipate there will be no extrapolated wishes that literally everyone agrees on.
(And that’s even without considering some meta formulations of CEV that propose to also take into account the wishes of counterfactual people who might exist in the future, and dead ones who existed in the past.)
Lots of people religiously believe that their god has planned (and prophesied) a specific event of drastic universal change, after which future people will stop suffering in this world, or will stop being born to a life of negative utility (end of the world), or will be rescued from horrible eternal torture (Hell), or which is necessary for the true believers to actually be resurrected or to enter the good afterlife. (Obviously people don’t believe all of this at once; these are variant examples.)
Some others believe that life in this world is suffering, negative utility, and ought to be stopped for its own sake (stopping the cycle of rebirth).
Well, now you know there exist people who believe that there are some universally acceptable wishes. Let’s do the Aumann update :)
False beliefs ⇒ irrelevant after extrapolation.
False beliefs (rebirth, existence of nirvana state) ⇒ irrelevant after extrapolation.
Aumann update works only if I believe you’re a perfect Bayesian rationalist. So, no thanks.
Since you aren’t giving any valid examples of universally acceptable wishes (I’ve pointed out people who don’t wish for the examples you gave), why do you believe such wishes exist?
Only if you modify these actual people to have their extrapolated beliefs instead of their current ones. Otherwise the false current beliefs will keep on being very relevant to them. Do you want to do that?
Too bad. Let’s just agree to disagree then, until the brain scanning technology is sufficiently advanced.
So far, I haven’t seen a convincing example of a person who truly wished for everyone to die, even in extrapolation.
To them, yes, but not to their CEV.
Or until you provide the evidence that causes you to hold your opinions.
I think it’s plausible such people exist. Conversely, if you fine-tune your implementation of “extrapolation” to make their extrapolated values radically different from their current values (and incidentally matching your own current values), that’s not what CEV is supposed to be about. But before talking about that, there’s a more important point:
So why do you care about their extrapolated values? If you think CEV will extrapolate something that matches your current values but not those of many others; and you don’t want to change by force others’ actual values to match their extrapolated ones, so they will suffer in the CEV future; then why extrapolate their values at all? Why not just ignore them and extrapolate your own, if you have the first-mover advantage?
Extrapolated values are the true values. Whereas the current values are approximations, sometimes very bad and corrupted approximations.
This does not follow.
What makes you give them such a label as “true”? There is no such thing as a “correct” or “objective” value. All values are possible in the sense that there can be agents with all possible values, even paperclip-maximizing. The only interesting property of values is who actually holds them. But nobody actually holds your extrapolated values (today).
Current values (and values in general) are not approximations of any other values. All values just are. Why do you call them approximations?
In your CEV future, the extrapolated values are maximized. Conflicting values, like the actual values held today by many or all people, are necessarily not maximized. In proportion to how much this happens, which is positively correlated to the difference between actual and extrapolated values, people who hold the actual values will suffer living in such a world. (If the AI is a singleton they will not even have a hope of a better future.)
Briefly: suffering ~ failing to achieve your values.
They are reflectively consistent in the limit of infinite knowledge and intelligence. This is a very special and interesting property.
But people would change—gaining knowledge and intelligence—and thus would become happier and happier with time. And I think CEV would try to synchronize this with the timing of its optimization process.
Paperclipping is also self-consistent in that limit. That doesn’t make me want to include it in the CEV.
Evidence please. There’s a long long leap from ordinary gaining knowledge and intelligence through human life, to “the limit of infinite knowledge and intelligence”. Moreover we’re considering people who currently explicitly value not updating their beliefs in the face of knowledge, and basing their values on faith not evidence. For all I know they’d never approach your limit in the lifetime of the universe, even if it is the limit given infinite time. And meanwhile they’d be very unhappy.
So you’re saying it wouldn’t modify the world to fit their new evolved values until they actually evolved those values? Then for all we know it would never do anything at all, and the burden of proof is on you to show otherwise. Or it could modify the world to resemble their partially-evolved values, but then it wouldn’t be a CEV, just a maximizer of whatever values people happen to already have.
Then we can label paperclipping as a “true” value too. However, I still prefer true human values to be maximized, not true clippy values.
As I said before, if someone’s mind is that incompatible with truth, I’m ok with ignoring their preferences in the actual world. They can be made happy in a simulation, or wireheaded, or whatever the combined other people’s CEV thinks best.
No, I’m saying, the extrapolated values would probably estimate the optimal speed for their own optimization. You’re right, though, it is all speculation, and the burden of proof is on me. Or on whoever will actually define CEV.
And as I and others said, you haven’t given any evidence that such people are rare or even less than half the population (with respect to some of the values they hold).
That’s a good point to end the conversation, then :-)
I’m very dubious of CEV as a model for Friendly AI. I think it’s a bad idea for several reasons. So, not that either.
Also, on topic, recall that, when you extrapolate the volition of crazy people, their volition is not, in particular, more sane. It is more as they would like to be. If you see lizard people, you don’t want to see lizard people less. You want sharpened senses to detect them better. Likewise, if you extrapolate a serial killer, you don’t get Gandhi. You get an incredibly good serial killer.
I don’t see how this is possible. One can be dubious about whether it can be defined in the way it is stated, or whether it can be implemented. But assuming it can, why would it be controversial to fulfill the wish(es) of literally everyone, while affecting everything else the least?
Extrapolating volition includes correcting wrong knowledge and increasing intelligence. So, you do stop seeing lizard people if they don’t exist.
Serial killers are a more interesting example. But they too don’t want everyone to die. Assuming serial killers get full knowledge of their condition and sufficient intelligence for understanding it, what would their volition actually be? I don’t know, but I’m sure it’s not universal death.
Problems:
Extrapolation is poorly defined, and, to me, seems to go in either one of two directions: either you make people more as they would like to be, which throws any ideas of coherence out the window, or you make people ‘better’ along a specific axis, in which case you’re no longer directing the question back at humanity in a meaningful sense. Even something as simple as removing wrong beliefs (as you imply) would automatically erase any but the very weakest theological notions. There are a lot of people in the world who would die to stop that from happening. So, yes, controversial.
Coherence, one way or another, is unlikely to exist. Humans want a bunch of different things. Smarter, better-informed humans would still want a bunch of different, conflicting things. Trying to satisfy all of them won’t work. Trying to satisfy the majority at the expense of the minorities might get incredibly ugly incredibly fast. I don’t have a better solution at this time, but I don’t think taking some kind of vote over the sum total of humanity is going to produce any kind of coherent plan of action.
But would that be actually uglier than the status quo? Right now, to a very good approximation, those who were born from the right vagina are satisfied at the expense of those born from the wrong vagina. Is that any better?
I call the Litany of Gendlin on the idea that everyone can’t be fully satisfied at once. And I also call the Fallacy of Gray on the idea that if you can’t do something perfectly, then doing it decently is no better than not doing it at all.
I don’t know. It conceivably could be, and there would be no possibility of improving it, ever. I’m just saying it might be wise to have a better model before we commit to something for eternity.
For extrapolation to be conceptually plausible, I imagine “knowledge” and “intelligence level” to be independent variables of a mind, knobs to turn. To be sure, this picture looks ridiculous. But assuming, for the sake of argument, that this picture is realizable, extrapolation appears to be definable.
Yes, many religious people wouldn’t want their beliefs erased, but only because they believe them to be true. They wouldn’t oppose increasing their knowledge if they knew it was true knowledge. Cases of belief in belief would be dissolved if it was known that true beliefs were better in all respects, including individual happiness.
Yes, I agree with this. But, I believe there exist wishes universal for (extrapolated) humans, among which I think there is the wish for humans to continue existing. I would like for AI to fulfil this wish (and other universal wishes if there are any), while letting people decide everything else for themselves.
It is not clear that CEV as a model for FAI does either of those things.
AFAIK, CEV is not well-defined or fully specified, except as a declaration of intent, a research direction. Thus, it does not make sense to say whether CEV as a model for FAI does or does not in fact do specific things. It only makes sense to say whether the intention of CEV’s developers is for it to do or not do those things, and whether CEV’s specification so far contradicts or does not contradict those things.
AFAIU, CEV’s developers’ intent and CEV’s specification so far (with an added “unanimity” condition, if it is not present in the standard CEV specification) do not contradict my statement.
Just to make sure I understand your claim: you’re asserting that we can identify some set of people in the world right now who are “CEV’s developers,” and if we asked them “does CEV fulfill the wish(es) of literally everyone while affecting everything else the least?” they would agree that it clearly does?
No, because “does CEV fulfill....?” is not a well-defined or fully specified question. But I think, if you asked “whether it is possible to build FAI+CEV in such a way that it fulfills the wish(es) of literally everyone while affecting everything else the least”, they would say they do not know.
Ah, OK. I completely misunderstood your claim, then. Thanks for clarifying.
Maybe there are better plans that don’t involve specifically “sentient” “life” continuing on a “planet”, concepts that could all break under sufficient optimization pressure if they don’t happen to be optimal. The simplest ones are “planet” and “life”: it doesn’t seem like a giant ball of simple elements could be the optimal living arrangement, or biological bodies (“life”, if that’s what you meant) an optimal living substrate.
I assume FAI, which includes full (super-)human understanding of what is actually meant by “sentient life to continue”.
“Planet” is a “planet”, even if you should be working on something else, which is what I meant by usual concepts breaking down.
Think of “sentient life continuing on the planet” as a single concept, extrapolatable in various directions as becomes necessary. So, “planet” can be substituted by something else.
But it’s the only relevant one, when we’re talking about CEV. CEV is only useful if FAI is created, so we can take it for granted.
Ah, the FAI problem in a nutshell.