It’s much easier to get donations and avoid potential political problems (like the US government intervening) if EY is implementing the CEV of all humans rather than the CEV of everyone who donates. If EY seems like a mad scientist hellbent on taking over the world for himself and a small group of people, many people will treat him appropriately. Just think about your gut-level reaction to hearing “EY wants to implement CEV for only SIAI volunteers and donors” and to “EY wants to implement CEV for all of humanity.”
Note: I'm not claiming there won’t be any political problems with CEV for all humans, only that pushing for CEV for a small group of people will cause more problems in this arena.
“Just think about your gut-level reaction to hearing ‘EY wants to implement CEV for only SIAI volunteers and donors’ and to ‘EY wants to implement CEV for all of humanity.’”
The first actually sounds better to me. I am fairly certain most SIAI-involved people are well-meaning, or at the very least would not choose to cause J Random Stranger any harm if they could help it. I’m not so certain about ‘all of humanity’.
The relevant comparison isn’t what ‘all of humanity’ would choose, but rather what all of humanity would choose once CEV is done with their preferences.
This has been a source of confusion to me about the theory since I first encountered it, actually.
Given that this hypothetical CEV-extracting process gets results that aren’t necessarily anything that any individual actually wants, how do we tell the difference between an actual CEV-extracting process and something that was intended as a CEV-extracting process but that, due to a couple of subtle bugs in its code, is actually producing something other than its target’s CEV?
Is the idea that humanity’s actual CEV is something that, although we can’t necessarily come up with it ourselves, is so obviously the right answer once it’s pointed out to us that we’ll all nod our heads and go “Of course!” in unison?
Or is there some other testable property that only HACEV has? What property, and how do we test for it?
Because without such a testable property, I really don’t see why we believe flipping the switch on the AI that instantiates it is at all safe.
I have visions of someone perusing the resulting CEV assembled by the seed AI and going “Um… wait. If I’m understanding this correctly, the AI you instantiate to implement CEV will cause us all to walk around with watermelons on our feet.”
“Yes,” replies the seed AI, “that’s correct. It appears that humans really would want that, given enough time to think together about their footwear preferences.”
“Oh… well, OK,” says the peruser. “If you say so...”
Surely I’m missing something?
In light of some later comment-threads on related subjects, and in the absence of any direct explanations, I tentatively (20-40% confidence) conclude that the attitude is that the process that generates the code that extracts the CEV that implements the FAI has to be perfect, in order to ensure that the FAI is perfect, which is important because even an epsilon deviation from perfection multiplied by the potential utility of a perfect FAI represents a huge disutility that might leave us vomiting happily on the sands of Mars.
And since testing is not a reliable process for achieving perfection, merely for reducing defects to epsilon, it seems to follow that testing simply isn’t relevant. We don’t test the CEV-generator, by this view; rather we develop it in such a way that we know it’s correct.
And once we’ve done that, we should be more willing to trust the CEV-generator’s view of what we really want than our own view (which is demonstrably unreliable).
So if it turns out to involve wearing watermelons on our feet (or living gender-segregated lives on different planets, or whatever it turns out to be) we should accept that that really is our extrapolated volition, and be grateful, even if our immediate emotional reaction is confusion, disgust, or dismay.
I hasten to add that I’m not supporting this view, just trying to understand it.
Given the choice between (apparently benevolent people’s volition) + (unpredictable factor) and (all people’s volition) + (random factor), I’d choose the former every time.
Extrapolating volition doesn’t make agree with mine.
eh?
It’s also entirely plausible that “implement CEV for Americans” will get less US government intervention than “implement CEV for all humanity,” assuming that the US gov’t takes any official notice of any of this in the first place.
It’s not entirely clear to me what follows from any of this political speculation, though.
I don’t want to live with the wrong CEV for all eternity because it was politically expedient.
Personally I would much prefer the former—and I’m not a SIAI volunteer or donor (although I could then become one).
Fair enough, but I was talking about your gut-level reaction as a proxy for the rest of humanity’s gut-level reaction. You may not have the reaction, but most of the rest of the world would see “CEV for only us” and think mad scientist/doomsday cult/etc. because the pattern fits.
Most of the world is going to see the words “AI” and “Singularity”, think mad scientist, and send troops. The word “CEV” they’re going to ignore, because it’s unfamiliar and the media won’t tell them what it is.
You really think the public associates “AI” and “Singularity” with mad scientist? That seems like an exaggeration to me.
Depends on your definition of “the public”. Most people in the world population have certainly never heard of “the singularity”, and while they may have heard about the Hollywood concept of “AI” (which actually portrays UFAI pretty well, except that the Hollywood versions are normally stupider-than-humans), they know nothing about AI as it exists or might exist in reality.
More to the point, very few people in the world have thought seriously about either topic, or ever will. I expect that most people will accept a version deriving from something presented in the media. Among the things the media might present, “mad science” ranks high: it’s likely they’ll call it “science” (or technology/engineering), and they will surely present it as impossible and/or undesirable, which makes it mad.
Mad science, even Evil Mad Science, is really not so bad and may be a mark of respect. Contrast it with the popular image of Evil Science, like Nazi scientists doing human experiments. Or Unnatural Science, the Frankenstein meme (which the public image of cryonics barely skirts).
The other image the Singularity is tainted with in the public mind is, of course, “the rapture of the nerds”: atheist geeks reinventing silly religion and starting cults (like LW). In other words, madness without the science. Mad science would be an upgrade to the Singularity’s public image right now. Mad science is something people take a little seriously, because it just might work, or at least leave a really big hole.
Test my hypothesis! Try to explain the concept of a fooming AI-driven singularity to anyone who hasn’t heard of it in depth, in 5 minutes—more than most people will spend on listening to the media or thinking about the subject before reaching a conclusion. See if you can, even deliberately, make them reach any conclusion other than “mad scientist” or “science-religious cultist” or “just mad”.
Explaining it to geeks is easy enough IME. (“There’s no reason an AI would be anything like a human or care about anything humans care about, so it might increase its power then kill us all by accident. Friendly AI is the quest to make an AI that actually cares about humans.”) With non-geeks, I suspect you'd get results like those you describe.
For non-geeks, I would drop the word “intelligence”, which carries too much baggage.
“Machines that can improve their ability to improve themselves can improve very quickly, much faster than you might expect if you don’t look at the math. And if a machine quickly self-improves to the point where it can change the world in radical ways, those changes might make us really unhappy or even kill us all. So we want self-improving machines to be ‘Friendly’: that is, we want them to be designed in such a way that the changes they make to themselves and their environment are good for humans. The upside is that a Friendly self-improving machine can also make the environment much, much, much better than you might expect… for example, it can develop improved technologies, cures for diseases, and more reliable economic models, extend longevity, etc.”
Come to think of it, that might be better for many geeks as well, who are not immune to the baggage of “intelligence”. Though many would likely be offended by my saying so.
Yes—and geeks are not representative of the population at large, and not at all representative of powerful individuals (politicians, government officials, army commanders, rich businessmen). Even with geeks, I expect a success rate well below 100% due to future shock and imperfect updating.
‘Accordingly’ would seem to be the appropriate word in the context.