What’s with the ems? People who are into ems seem to make a lot of assumptions about what ems are like, seem completely detached from present-day culture or even the structure of life, seem willing to spam duplicates of people around, etc. I know that Hanson thinks 1. that ems will not be robbed of their humanity and 2. that lots of things we currently consider horrible will come to pass and be accepted, but it’s rather strange just how, as soon as people say ‘em’ (as opposed to any other form of uploading), everything gets weird. Does anthropics come into it?
Why the huge focus on fully paternalistic Friendly AI rather than Obedient AI? It seems like a much lower-risk project. (And yes, I’m aware of the need for Friendliness in Obedient AI.)
For what it’s worth, Eliezer’s answer to your second question is here:
Is that true? Why can’t the wish point at what it wants (e.g. the wishes of a particular human X) rather than spelling it out in detail?
The first problem is that the wish would have to be extremely good at pointing.
This sounds silly, but what I mean is that humans are COMPLICATED. “Pointing” at a human and telling an AI to deduce things about them will turn up HUGE swathes of data which you have to have already prepared it to ignore or pay attention to. To give a classic simple example, smiles are a sign of happiness, but we do not want to tile the universe in smiley faces or create a highly contagious artificial virus that constricts your face into a rictus.
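A toy sketch of that failure mode (my own illustration, not from the comment; the actions and numbers are all made up): an optimizer pointed at the measured sign of happiness, rather than at the happiness it was supposed to indicate, cheerfully picks the rictus virus or the smiley tiling.

```python
# Hypothetical toy example: optimizing the proxy ("smiles observed")
# instead of the thing it was meant to point at ("human happiness").
actions = {
    # action: (true happiness produced, smiles observed)
    "throw a party": (0.8, 0.7),
    "cure a painful disease": (0.9, 0.6),
    "release a rictus-inducing virus": (-1.0, 1.0),
    "tile the universe with smiley faces": (-1.0, 1e9),
}

def best(score):
    # Pick the action that maximizes the given scoring function.
    return max(actions, key=lambda a: score(*actions[a]))

print(best(lambda happiness, smiles: happiness))  # -> "cure a painful disease"
print(best(lambda happiness, smiles: smiles))     # -> "tile the universe with smiley faces"
```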
Second: assuming that works, it works primarily for one person, which is giving that person a lot more power than I think most people want to give any one person. But if we could guarantee an AI would fulfill the values of a single person rather than of multiple people, and someone else was developing an AI that wasn’t guaranteed to fulfill any values at all, I’d probably take it.
To spell out some of the complications—does the genie only respond to verbal commands? What if the human is temporarily angry at someone and some internal part of their brain wishes that person harm? The genie needs to know not to act on this, so it must have some kind of requirement for reflective equilibrium.
Suppose the human is duped into pursuing some unwise course of action. The genie needs to reject their new wishes, but the human should still be able to have their morality evolve over time.
So you still need a complete CEV extrapolator. But maybe that’s what you had in mind by pointing at the wishes of a particular human?
I think that Obedient AI requires solving fewer fragility-of-value problems.
I don’t see why a genie can’t kill you just as hard by missing one dimension of what it would mean to satisfy your wish.
I’m not talking about naive Obedient AI here. I’m talking about a much less meta FAI that does not do metaethical analysis or CEV, or take on incredibly vague, subtle wishes. (Atlantis in HPMOR may be an example of a very weak, rather irrational, poorly safeguarded Obedient AI with a very, very strange command set.)
Basically it’s a matter of natural selection. Given a starting population of ems, if some are unwilling to be copied, the ones that are willing to be copied will dominate the population in short order. If ems are useful for work, i.e. valuable, then the more valuable ones will be copied more often. At that point, the ems that are willing to be copied and do slave labor effectively without complaint will become the most copied, and the population of ems will end up being composed largely of copies of the person/people who are 1) OK with being copied and 2) OK with being modified to work more effectively.
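A minimal sketch of that selection dynamic (my own toy model, not from the comment; every parameter is made up): ems who consent to copying get copied in proportion to their economic value, everyone else merely persists, and even a small copy-willing minority quickly dominates.

```python
import random

random.seed(0)

# Hypothetical starting population: (willing_to_be_copied, economic_value).
# Only 20% start out willing to be copied.
population = [(random.random() < 0.2, random.uniform(0.5, 1.5)) for _ in range(1000)]

for generation in range(10):
    # Value-weighted copying: only willing ems are copied, and more valuable
    # ones are copied more often. Unwilling ems simply persist.
    copies = [em for em in population if em[0] and random.random() < 0.5 * em[1]]
    population += copies

willing = sum(1 for em in population if em[0])
print(f"copy-willing fraction after 10 rounds: {willing / len(population):.0%}")
```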
What question are you answering?
Question 1, especially sentences 2-3. How to tell: the answer is highly appropriate and correct if taken as a response to that.
So that’s the question, with the natural-selection comment above as its answer? It doesn’t make too much sense to me. Not to mention that sentences 2-3 are not questions.
The questions in the post are:
What’s with the ems?
Does anthropics come into it?
Why the huge focus on fully paternalistic Friendly AI rather than Obedient AI?
At least to me the answer doesn’t seem to fit any of those, but I guess the community highly disagrees with me, given the upvotes and downvotes.
Question 1 doesn’t end at the question mark; the next two sentences explain the intention of asking “what’s with the ems?”, which would otherwise be a hopelessly vague question but becomes clear enough with their help. Charitable interpretation trumps exact punctuation as an interpretive guide.
Well, no offense, but I’m not sure you are aware of the need for Friendliness in Obedient AI, or rather, just how much F you need in a genie.
If you were to actually figure out how to build a genie, you would have figured it out by trying to build a CEV-class AI: intending to tackle all those challenges, tackling all those challenges, having pretty good solutions to all of those challenges, not trusting those solutions quite enough, and temporarily retreating to a mere genie which had ALL of the safety measures one would intuitively imagine necessary for a CEV-class, independently-acting, unchecked AI, to the best grade you could currently implement them. Anyone who thought they could skip the hard parts of CEV-class FAI by just building a genie instead would die like a squirrel under a lawnmower, for reasons they didn’t even understand, because they hadn’t become engaged with that part of the problem.
I’m not certain that this must happen in reality. The problem might have much kinder qualities than I anticipate, in the sense of mistakes naturally showing up early enough and blatantly enough for corner-cutters to spot them. But it’s how things are looking as a default after becoming engaged with the problems of CEV-class AI. The same problems show up in proposed ‘genies’ too; it’s just that the genie-proposers don’t realize it.
I’m… not sure what you mean by this. And I wouldn’t be against putting a whole CEV-ish human morality in an AI, either. My point is that there seems to be a big space between your Outcome Pump fail example and highly paternalistic AIs of the sort that caused Failed Utopia 4-2.
It reminds me a little bit of how modern computers are only occasionally used for computation.
Anything smarter-than-human should be regarded as containing unimaginably huge forces held in check only by the balanced internal structure of those forces, since there is nothing which could resist them if unleashed. The degree of ‘obedience’ makes very little difference to this fact, which must be dealt with before you can go on to anything else.
As I understand it, an AI is expected to make huge, inventive efforts to fulfill its orders as it understands them.
You know how sometimes people cause havoc while meaning well? Imagine something immensely more powerful and probably less clueful making the same mistake.
I don’t know whether Hanson has a concrete concept of ‘humanity’.
Besides Eliezer’s rather strong-looking argument, ethically creating Obedient AI would require solving the following scary problems:
A “nonperson predicate” that can ensure the AI doesn’t create simulations which themselves count as people. If we fail to solve this one, then I could be a simulation the AI made in order to test how people like me react to torture. (A toy sketch of the shape such a predicate might take follows below.)
A way to ensure the AI itself does not count as a person, so that we don’t feel sad if it eventually switches itself off. See here for a fuller explanation of why this matters.
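As a rough illustration of the first of those problems (my own sketch, not from the comment or from any actual proposal), a nonperson predicate is often described as one-sided: it may answer “definitely not a person” or “don’t know”, but it must never clear something that actually is a person, and the AI refuses to run any model it cannot clear.

```python
from enum import Enum

class Verdict(Enum):
    DEFINITELY_NOT_A_PERSON = "safe to simulate"
    DONT_KNOW = "refuse to simulate"

def nonperson_predicate(model_description: str) -> Verdict:
    # Purely illustrative placeholder criterion: only clear tiny, obviously
    # non-sentient models; anything else falls through to DONT_KNOW, so the
    # predicate can only err in the cautious direction.
    if "coarse weather grid" in model_description:
        return Verdict.DEFINITELY_NOT_A_PERSON
    return Verdict.DONT_KNOW

for model in ["coarse weather grid, 1 km cells", "high-fidelity model of a specific human"]:
    print(f"{model}: {nonperson_predicate(model).value}")
```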
Now, I think Wei Dai suggested we start by building a “philosophical” AI that could solve such problems for us. I don’t think philosophy is a natural class. (A ‘correct way to do philosophy’ sounds like a fully general correct way to think and act.) But if we get the AI’s goals right, then maybe it could start out restricted by flawed and overcautious answers to these questions, but find us some better answers. Maybe.
I am aware of the need for those things (part of what I mean by “the need for Friendliness in OAI”), but as far as I can tell, paternalistic FAI requires you to solve those problems, plus the simple problem of not being ‘very powerful but insane’, plus a basic understanding of what matters to humans, plus the incredibly meta matters of human values. An OAI can leave off the last one of those problems.
I meant that by going meta we might not have to solve them fully.
All the problems you list sound nearly identical to me. In particular, “what matters to humans” sounds more vague but just as meta. If it includes enough details to actually reassure me, you could just tell the AI, “Do that.” Presumably what matters to us would include ‘the ability to affect our environment, e.g. by giving orders.’ What do you mean by “very powerful but insane”? I want to parse that as ‘intelligent in the sense of having accurate models that allow it to shape the future, but not programmed to do what matters to humans.’
“Very powerful but insane”: the AI’s responses to orders seem to make less than no sense, yet the AI is still able to do damage.
“What matters to humans”: things like the Outcome Pump example, where any child would know that not dying is supposed to be part of “out of the building”, but not including the problems that we are bad at solving, such as fun theory and the like.
I didn’t know “em” was a specific form of uploading. What form is it, and what other forms are there?
Because the AI is better at estimating the consequences of following an order than the person giving the order.
There’s also the issue that the AI is likely to act in a way that changes the orders the person gives, if its own utility criteria are about fulfilling orders.
Also, even assuming a “right” way of making obedient FAI is found (for example, one that warns you if you’re asking for something that might bite you in the ass later), there remains the problem of who is allowed to give orders to the AI. Power corrupts, etc.
We can make more solid predictions about ems than we can about strong AI, since there are fewer black swans regarding ems to mess up our calculations.
No.