I don’t understand your answer. Let’s try again. If “something like CEV” is what you want to implement, then an AI pointed at your volition will derive and implement CEV, so you don’t need to specify it in detail beforehand. If CEV isn’t what you want to implement, then why are you implementing it? Assume all your altruistic considerations, etc., are already folded into the definition of “you want”—just like a whole lot of other stuff-to-be-inferred is folded into the definition of CEV.
ETA: your “don’t be evil” looks like a confusion of levels to me. If you don’t want to be evil, there’s already a term for that in your volition—no need to add any extra precautions.
The sane answer is that it solves a cooperation problem, i.e., people will not kill you for trying it and may instead donate money. As we can see here, this is not the position that Eliezer seems to take. He goes for the ‘signal naive morality via incomprehension’ approach.
I do not think this would work. Take the viewpoint of a government. What does CEV do? It deprives them of some amount of ultimate power. The only chance I see of implementing CEV via an AI going FOOM is to do it either secretly or before anyone takes you seriously enough to intervene. Both routes are rather unlikely. Military analysis of LW seems to be happening right now. And if no huge unforeseeable step towards AGI happens, progress will be gradual enough for governments (or other groups), who already investigate LW and the SIAI, to notice and take measures to disable anyone trying to implement CEV.
The problem is that once CEV becomes feasible, governments will treat anyone working on it as attempting a coup. Regardless of the fact that the people involved might not perceive it as politics, working on CEV is a highly political activity. At least this will be the viewpoint of many who do not understand CEV or oppose it for other reasons.
Pardon me. To be more technically precise: “Implementing an AI that extrapolates the volition of something other or broader than yourself may facilitate cooperation. It would reduce the chance that people will kill you for the attempt and increase the chance of receiving support.”
Aha, I see. My mistake, ignoring the larger context.
Seen this? Anyway, I feel that it is really hard to tackle this topic because of its vagueness. As multifoliaterose implied here, at the moment even the task of recognizing humans as distinguished beings seems to me too broad a problem to tackle directly. Arguing about implementing CEV indirectly, by derivation from Yudkowsky’s mind, versus specifying the details beforehand, may be fun but is ultimately ineffective at this point. In other words, an organisation that claims to solve some meta-problem by means of CEV is only slightly different from one proclaiming to make use of magic. I’d be much more comfortable donating to a decision theory workshop, for example.
I digress, but I thought I should clarify some of my motivation for always getting into discussions involving the SIAI. It is highly interesting, sociologically I suppose. On the one hand people take this topic very seriously, as the most important topic indeed, yet they seem to be very relaxed about the only organisation involved in shaping the universe. There is simply no talk about more transparency to prove the effectiveness of the SIAI and its objectives. Further, without transparency you simply cannot conclude that, because someone writes a lot of ethically correct articles and papers, that output is reflective of their true goals. Also, people don’t seem to be worried very much about all the vagueness involved here, as this post proves once again. Where is the progress that would justify further donations? As I said, I digress. Excuse me, but this topic is the most fascinating issue for me on LW.
Back to your comment, it makes sense. Surely, if you tell people that you will also take care of what they want, they’ll be less opposed than if you told them that you’ll just do what you want because you want to make them happy. Yet there will be those who don’t want you to do it, regardless of your wanting to make them happy. There will be those who only want you to implement their personal volition. So once CEV is taken seriously, it will become really hard to implement, because people will get mad about it, really mad. People already oppose small-impact policies just because it’s the other party that is trying to implement them. What will they do if one person or organisation tries to implement a policy for the whole universe and the rest of infinity?
There is simply no demand for more transparency to prove the effectiveness of the SIAI and its objectives.
Are you sure? I imagine there are many people interested in evaluating the effectiveness of the SIAI. At least I am, and from the small number of real discussions I have had about the SIAI’s project I extrapolate that uncertainty is the main inhibitor of enthusiasm (although of course, if the uncertainty were removed, this might create more fundamental problems).
The counterargument I’ve read in earlier (“unreal”) discussions on the subject is, roughly, that people who claim their support for SIAI is contingent on additional facts, analyses, or whatever are simply wrong… that whatever additional data is provided along those lines won’t actually convince them, it will merely cause them to ask for different data.
I assume you’re referring to Is That Your True Rejection?.
(nods) I think so, yes.
This strikes me as a difficult thing to know, and the motives that lead to assuming it are not particularly pleasant.
While the unpleasant readings are certainly readily available, more neutral readings are available as well.
By way of analogy: it’s a common relationship trope that suitors who insist on proof of my love and fidelity won’t be satisfied with any proofs I can provide. OTOH, it’s also a common trope that suitors who insist that I should trust in their love and fidelity without evidence don’t have them to offer in the first place.
If people who ask me a certain type of question aren’t satisfied with the answer I have, I can either look for different answers or for different people; which strategy I pick depends on the specifics of the situation. If I want to infer something about someone else based on their choice of strategy I similarly have to look into the specifics of the situation. IME there is no royal road to the right answer here.
It is a shame that understatement is so common it’s hard to be precise quickly; I meant to include neutral readings in “not particularly pleasant.”
Huh. Interesting.
Yes, absolutely, I read your comment as understatement… but if you meant it literally, I’m curious as to the whole context of your comment.
For example, what do you mean to contrast that counterargument with? That is: what’s an example of an argument for which the motives for assuming it are actively pleasant? What follows from their pleasantness?
A policy like “assume good faith” strikes me as coming from not unpleasant motives. What follows is that you should attribute a higher probability of good faith to someone who assumes good faith. If someone assumes that other people cannot be convinced by evidence, my knowledge of projection suggests that should increase my probability estimate that they cannot be convinced by evidence.
That doesn’t entirely answer your question, since I talked about policies and you’re talking about motives, but it should suggest an answer. Policies and statements represent a distribution of sets of possible motives, and so while the motives themselves unambiguously tell you how to respond, the policies just suggest good guesses. But, in general, pleasantness begets pleasantness and unpleasantness begets unpleasantness.
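The update described above is just Bayes’ rule applied to an observed statement. A minimal sketch, with entirely made-up numbers, of how hearing someone voice the assumption “other people cannot be convinced by evidence” could raise the estimate that the speaker cannot be convinced by evidence:

```python
# A minimal Bayes'-rule sketch with made-up numbers: observing that someone
# assumes others cannot be convinced by evidence raises our estimate that
# they themselves cannot be (the "projection" point above).

def posterior(prior, p_obs_given_h, p_obs_given_not_h):
    """P(H | observation) for a binary hypothesis H."""
    numerator = p_obs_given_h * prior
    return numerator / (numerator + p_obs_given_not_h * (1.0 - prior))

# Hypothetical numbers, purely illustrative:
prior = 0.2                # P(this person cannot be convinced by evidence)
p_assume_if_true = 0.6     # P(they voice that assumption | hypothesis true)
p_assume_if_false = 0.2    # P(they voice that assumption | hypothesis false)

print(posterior(prior, p_assume_if_true, p_assume_if_false))
# -> 0.4285..., i.e. the observation roughly doubles the initial 0.2 estimate
```

With these illustrative numbers the observation roughly doubles the prior; the direction of the update, not the exact figure, is the point.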
It strikes me as a tendency that can either be observed as a trend or noted to be absent.
This strikes me as a difficult thing to know. And distastefully ironic.
There are a large number of possible motives that could lead to assuming that the people in question are simply wrong. None of them are particularly pleasant (but not all of them are unpleasant). I don’t need to know which motivates them in order to make the statement I made. However, the statement as paraphrased by TheOtherDave is much more specific; hence the difficulty.
As a more general comment, I strongly approve of people kicking tires, even if they’re mine. When I see someone who doesn’t have similar feelings, I can’t help but wonder why. Like with my earlier comment, not all the reasons are unpleasant. But some are.
Please read this comment. It further explains why I actually believe that transparency is important to prove the effectiveness of the SIAI. I also edited my comment above. I seem to have messed up on correcting some grammatical mistakes. It originally said, there is simply no talk about more transparency....
I didn’t intend to write that. I don’t know what happened there.
“The only organisation involved in shaping the universe”?!? WTF? These folks have precious little in terms of resources. They apparently haven’t even started coding yet. You yourself assign them a minuscule chance of succeeding at their project. How could they possibly be “the only organisation involved in shaping the universe”?!?
Really? Even if they were working on a merely difficult problem, you would expect coding to be the very last step of the project. People don’t solve hard algorithmic problems by writing some code and seeing what happens. I wouldn’t expect an organization working optimally on AGI to write any code until after making some remarkable progress on the problem.
There could easily be no organization at all trying to deliberately control the long-term future of the human race; we’d just get whatever we happened to stumble into. You are certainly correct that there are many, many organizations which are involved in shaping our future; they just rarely think about the really long-term effects (I think this is what XiXiDu meant).
IMO, there’s a pretty good chance of an existing organisation being involved with getting there first. The main problem with not having any working products is that it is challenging to accumulate the resources needed to hire researchers and programmers, which you need to fuel your self-improvement cycle.
Google, hedge funds, and security agencies have their self-improvement cycle already rolling—they are evidently getting better and better as time passes. That results in accumulated resources, which can be used to drive further development.
If you are a search company aiming directly at a human-level search agent, you are up against a gorilla with an android army who already has most of the pieces of the puzzle. Waiting until you have done all the relevant R&D is just not how software development works. You get up and running as fast as you can, or else someone else does that first and eats your lunch.
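The argument about working products feeding a self-improvement cycle is, at bottom, a claim about compound growth. A toy sketch, with starting values and growth rates that are pure assumptions chosen only to illustrate the dynamic, not estimates of any real organisation:

```python
# A toy model of the "accumulated resources drive further development" claim:
# an incumbent that reinvests product revenue compounds its R&D capacity,
# while a lab that waits until all research is done grows only slowly.
# All numbers are illustrative assumptions.

def capability_after(years, start, growth_rate):
    """Capability compounding multiplicatively each year."""
    value = start
    for _ in range(years):
        value *= (1.0 + growth_rate)
    return value

incumbent = capability_after(10, start=100.0, growth_rate=0.30)   # revenue-fed cycle
waiting_lab = capability_after(10, start=10.0, growth_rate=0.10)  # no products yet

print(round(incumbent), round(waiting_lab))  # ~1379 vs ~26
```

Under these made-up numbers the gap widens every year, which is the sense in which the incumbent “eats your lunch”.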
So once CEV is taken seriously, it will become really hard to implement, because people will get mad about it, really mad. People already oppose small-impact policies just because it’s the other party that is trying to implement them. What will they do if one person or organisation tries to implement a policy for the whole universe and the rest of infinity?
Right, but this seems as though it isn’t how things are likely to go down. CEV is a pie-in-the-sky wishlist, not an engineering proposal. Those attempting to implement things like it directly seem practically guaranteed to get to the plate last. For example, Ben’s related proposal involved “non-invasive” scanning of the human brain. That just isn’t technology we will get before we have sophisticated machine intelligence, I figure. So either the proposals will be adjusted to be more practical en route, or else the proponents will just fail.
Most likely there will be an extended stage where people tell the machines what to do—much as Asimov suggested. The machines will “extrapolate” in much the same way that Google Instant “extrapolates”—and the human wishes will “cohere”—to the extent that large-scale measures in society encourage cooperation.
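The Google Instant analogy is concrete enough to sketch: “extrapolating” a wish in this weak sense is just completing a partial request from a log of past requests. A minimal illustration, with a hypothetical request log of my own invention:

```python
# A minimal sketch of the kind of "extrapolation" the Google Instant analogy
# points at: completing a partial request from observed past requests.
# The request log is hypothetical.

from collections import Counter

past_requests = [
    "book a flight to berlin",
    "book a flight to boston",
    "book a hotel in berlin",
    "back up my files",
]

def extrapolate(prefix, log, top_n=2):
    """Return the most frequent past requests that extend the given prefix."""
    matches = Counter(r for r in log if r.startswith(prefix))
    return [request for request, _ in matches.most_common(top_n)]

print(extrapolate("book a flight", past_requests))
# -> ['book a flight to berlin', 'book a flight to boston']
```

This is of course far short of what CEV means by extrapolation; the analogy only claims that early systems will fill in wishes in roughly this statistical way.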
FWIW, I mostly gave up on them a while back. As a spectator, I mostly look on, grimacing, while wondering whether there are any salvage opportunities.
Here is the original comment. It wasn’t my intention to say that; it originally said there is simply no talk about more transparency.... I must have messed up on correcting some mistakes.
I just copied-and-pasted verbatim. However the current edit does seem to make more sense.
That is more-or-less my own analysis. Notoriously:
Politics is the gentle art of getting votes from the poor and campaign funds from the rich by promising to protect each from the other.
CEV may get some of the votes from the poor, but it offers precious little to the rich. Since those are the folk who are running the whole show, it is hard to see how they will approve it. They won’t approve it; there isn’t anything in it for them. So, I figure, the plan is probably pretty screwed: the hopeful plan of a bunch of criminal (their machine has no respect for the law!) and terrorist (if they can make it stick!) outlaws who dream of overthrowing their own government.
Awesome comment, thanks. I’m going to think wishfully and take that as SIAI’s answer.
The sane answer is that it solves a cooperation problem.
Reciprocal altruism sometimes sends a relatively weak signal—it says that you will cooperate so long as the “shadow of the future” is not too ominous.
Invoking “good” and “evil” signals something stronger: that you believe in moral absolutes, the forces of good and evil.
On the one hand, that is a stronger signalling technique—it attempts to signal that you won’t defect—no matter what!
On the other hand, it makes you look a bit as though you are crazy, don’t understand rationality or game theory—and this can make your behaviour harder to model.
As with most signalling, it should be costly to be credible. Alas, practically anyone can rattle on about good and evil. I am not convinced it is very effective overall.
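The “shadow of the future” remark has a standard formalisation in Axelrod’s iterated prisoner’s dilemma: reciprocal cooperation (e.g. tit-for-tat) only holds while the probability of meeting again is high enough relative to the payoffs. A quick check using the usual textbook payoffs (T=5, R=3, P=1, S=0), which are my assumption here, not figures from the thread:

```python
# "Shadow of the future": with prisoner's dilemma payoffs T > R > P > S and
# continuation probability w, Axelrod's condition says tit-for-tat resists
# defection only when w >= max((T-R)/(R-S), (T-R)/(T-P)).
# Payoff values below are the usual textbook ones, chosen for illustration.

def cooperation_sustainable(w, T=5, R=3, P=1, S=0):
    """True if tit-for-tat resists both all-defect and alternating exploitation."""
    threshold = max((T - R) / (R - S), (T - R) / (T - P))
    return w >= threshold

for w in (0.3, 0.6, 0.9):
    print(w, cooperation_sustainable(w))
# -> cooperation only holds once the chance of meeting again reaches about 2/3
```

With these payoffs the threshold is about 0.67, which is the precise sense in which reciprocal altruism is conditional, and hence a comparatively weak signal.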
then an AI pointed at your volition will derive and implement CEV
Also, from the OP:
Why must this particular thing be spelled out in a document like CEV and not left to the mysterious magic of “intelligence”, and what other such things are there?
If what you want is to have something pointed at your volition, then you first have to design the AI that points to it rather than to something else. This whole CEV stuff was an attempt at answering the “design an AI that points to it” question, and the crucial consideration that led to it was that there is no magically intelligent system that would automatically converge to what we’d prefer. Of course, there remains the question of balance between AI structure determined by what I want and AI structure determined by what the AI thinks I want. The realization of FAI is that you cannot eliminate the first item from the balance and get an acceptable result. It is better to ask “How could I best solve the FAI problem using my brain rather than something else?” than to ask “Could I use something other than my brain to solve the FAI problem?”.
If a CEV isn’t what I want to implement, it is still good to implement CEV, because it’ll find out what I want to implement, plus more stuff that I would agree to implement but would not think of in the first place.
Eliezer didn’t realize that you meant his own personal CEV, rather than his current incoherent, unextrapolated volition.