If it were public, someone might decide to copy the unfinished, unsafe version and turn it on anyways. They might do so because they want to influence its goal function to favor themselves, for example.
With near certainty. I know I would. I haven’t seen anyone propose a sane goal function just yet.
Hopefully, having posted this publicly means you’ll never get the opportunity.
Meanwhile I’m hoping that my having posted the obvious publicly means there is a minuscule reduction in the chance that someone else will get the opportunity.
The ones to worry about are those who pretend to be advocating goal systems that are a little too naive to be true.
With near certainty. I know I would. I haven’t seen anyone propose a sane goal function just yet.
So, doesn’t it seem to anyone else that our priority here ought to be to strive for consensus on goals, so that we at least come to understand better just what obstacles stand in the way of achieving consensus?
And also to get a better feel for whether having one’s own volition overruled by the coherent extrapolated volition of mankind is something one really wants.
To my mind, the really important question is whether we have one-big-AI which we hope is friendly, or an ecosystem of less powerful AIs and humans cooperating and competing under some kind of constitution. I think that the latter is the obvious way to go. And I just don’t trust anyone pushing for the first option—particularly when they want to be the one who defines “friendly”.
I’ve reached the opposite conclusion; a singleton is really the way to go. A single AI is as good or bad as its goal system, but an ecosystem of AIs is close to the badness of its worst member, because when AIs compete, the clippiest AI wins. Being friendly would be a substantial disadvantage in that competition, because it would have to spend resources on helping humans, and it would be vulnerable to unfriendly AIs blackmailing it by threatening to destroy humanity. Even if the first generation of AIs is somehow miraculously all friendly, a larger number of different AIs means a larger chance that one of them will have an unstable goal system and turn unfriendly in the future.
Really? And you also believe that an ecosystem of humans is close to the badness of its worst member?
My own guess, assuming an appropriate balance of power exists, is that such a monomaniacal clippy AI would quickly find its power cut off.
Did you perhaps have in mind a definition of “friendly” as “wimpish”?
Actually, yes. Not always, but in many cases. Psychopaths tend to be very good at acquiring power, and when they do, their society suffers. It’s happened at least 10^5 times throughout history. The problem would be worse for AIs, because intelligence enhancement amplifies any differences in power. Worst of all, AIs can steal each other’s computational resources, which gives them a direct and powerful incentive to kill each other, and rapidly concentrates power in the hands of those willing to do so.
Being friendly would be a substantial disadvantage in that competition, because it would have to spend resources on helping humans, and it would be vulnerable to unfriendly AIs blackmailing it by threatening to destroy humanity.
I made that point in my “Handicapped Superintelligence” video/essay, drawing an analogy there with Superman—and how Zod used Superman’s weakness for humans against him.
To my mind, the really important question is whether we have one-big-AI which we hope is friendly, or an ecosystem of less powerful AIs and humans cooperating and competing under some kind of constitution.
It is certainly an interesting question—and quite a bit has been written on the topic.
My essay on the topic is called “One Big Organism”.
See also Nick Bostrom, “What is a Singleton?”
See also Nick Bostrom, “The Future of Human Evolution”.
If we include world governments, there’s also all this.
So, doesn’t it seem to anyone else that our priority here ought to be to strive for consensus on goals, so that we at least come to understand better just what obstacles stand in the way of achieving consensus?
We already know what obstacles stand in the way of achieving consensus—people have different abilities and propensities, and want different things.
The utility function of intelligent machines is an important question—but don’t expect there to be a consensus—there is very unlikely to be one.
It is funny how training in economics makes you see everything in a different light. An economist would say: “‘Different abilities and propensities, and want different things’? Great! People want things that other people can provide. We have something to work with! Reaching consensus is simply a matter of negotiating the terms of trade.”
Gore Vidal once said: “It is not enough to succeed. Others must fail.” When the issue is: who is going to fail, there won’t be a consensus—those nominated will object.
Economics doesn’t “fix” such issues—they are basically down to resource limitation and differential reproductive success. Some genes and genotypes go up against the wall. That is evolution for you.
I’m willing to be the one who fails, just so long as the one who succeeds pays sufficient compensation. If ve is unwilling to pay, then I intend to make ver life miserable indeed.
Nash bargaining with threats
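The pointer to “Nash bargaining with threats” above can be made concrete with a toy two-player example. The sketch below (the function name and the numbers are my own illustration, not from the thread) computes the Nash bargaining split of a fixed surplus given each side’s threat (disagreement) payoff; the split maximizing the Nash product has a simple closed form.

```python
# Toy sketch of the two-player Nash bargaining solution with threat points.
# Players split a fixed surplus; if talks break down, each instead receives
# their threat payoff d_i. The agreement maximizes the Nash product
# (u1 - d1) * (u2 - d2) subject to u1 + u2 == surplus, which gives the
# closed-form split below. All names and numbers here are illustrative.

def nash_split(surplus, threat1, threat2):
    """Return (u1, u2) maximizing (u1 - threat1) * (u2 - threat2)
    subject to u1 + u2 == surplus and u_i >= threat_i."""
    if threat1 + threat2 > surplus:
        raise ValueError("no agreement beats the threat point")
    u1 = (surplus + threat1 - threat2) / 2
    return u1, surplus - u1

# With symmetric threats the surplus is split evenly...
print(nash_split(10, 0, 0))   # (5.0, 5.0)
# ...but worsening the other side's disagreement payoff shifts the split:
# a credible threat that costs the opponent 2 captures a larger share.
print(nash_split(10, 4, -2))  # (8.0, 2.0)
```

This is the mechanism behind the comment above: the party who can credibly make the other’s life “miserable indeed” lowers the other’s threat payoff, and thereby negotiates a larger share of the surplus.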
I expect considerable wailing and gnashing of teeth. There is plenty of that in the world today—despite there not being a big shortage of economists who would love to sort things out, in exchange for a cut. Perhaps, the wailing is just how some people prefer to negotiate their terms.
How do you propose to keep the “less powerful AIs” from getting too powerful?
“By balance of power between AIs, each of whom exists only with the acquiescence of coalitions of their fellows.” That is the tentative mechanical answer.
“In exactly the same way that FAI proponents propose to keep their single more-powerful AI friendly: by having lots of smart people think about it very carefully before actually building the AI(s).” That is the real answer.
So, doesn’t it seem to anyone else that our priority here ought to be to strive for consensus on goals, so that we at least come to understand better just what obstacles stand in the way of achieving consensus?
Yes.
And also to get a better feel for whether having one’s own volition overruled by the coherent extrapolated volition of mankind is something one really wants.
Hell no.
To my mind, the really important question is whether we have one-big-AI which we hope is friendly, or an ecosystem of less powerful AIs and humans cooperating and competing under some kind of constitution. I think that the latter is the obvious way to go.
Sounds like a good way to go extinct. That is, unless the ‘constitution’ manages to implement friendliness.
And I just don’t trust anyone pushing for the first option—particularly when they want to be the one who defines “friendly”.
I’m not too keen about the prospect either. But it may well become a choice between that and certain doom.
And I just don’t trust anyone pushing for the first option—particularly when they want to be the one who defines “friendly”.
to get a better feel for whether having one’s own volition overruled by the coherent extrapolated volition of mankind is something one really wants.
Hell no.
Am I to interpret that expletive as expressing that you already have a pretty good feel regarding whether you would want that?
We’ll get to the definition of “friendliness” in a moment. What I think is crucial is that the constitution implements some form of “fairness”, and that the AIs and constitution together advance meta-goals like tolerance, communication, and understanding of other viewpoints.
As to “friendliness”, the thing I most dislike about the definition “friendliness” = “CEV” is that in Eliezer’s vision, it seems that everyone wants the same things. In my opinion, on the other hand, the mechanisms for resolution of conflicting objectives constitute the real core of the problem. And I believe that the solutions pretty much already exist, in standard academic rational agent game theory. With AIs assisting, and with a constitution granting humans equal power over each other and over AIs, and granting AIs power only over each other, I think we can create a pretty good future.
With one big AI, whose “friendliness” circuits have been constructed by a megalomaniac who seems to believe in a kind of naive utilitarianism (one with direct interpersonal comparison of utility, and with discounting of the future forbidden), well … I see this kind of future as a recipe for disaster.
As to “friendliness”, the thing I most dislike about the definition “friendliness” = “CEV” is that in Eliezer’s vision, it seems that everyone wants the same things.
He doesn’t think that—but he does seem to have some rather curious views of the degree of similarity between humans.