It’s always going to be a win-win for whoever created it.
Well, thankfully a lot of the people here care enough about the opinions of others that they want to work out a framework that will work well for others. Note, incidentally, that it isn’t necessarily the case that it will even be a win for the programmer. Bad AIs can end up trying to paperclip the Earth. Even the democracy example would be difficult for the AI to achieve. Say, for example, that I tell the AI to determine things with a democratic system and to give that the highest priority, and then a majority of people decide to do away with the democracy. What is the AI supposed to do? Keep in mind that AIs are not going to act like villainous computers from bad sci-fi, where simply giving the machines an apparent contradiction will make them overheat and melt down.
Possibly, though I doubt it. But even if it is, you can just do that democracy thing on the group in question, not the whole world. Also, until your AI is smart enough and powerful enough to work at that level, it’s going to be extremely dangerous to declare that the AI will be in charge of the world from then on. Even if it’s working perfectly, without the proper resources and strategy in place, it’s going to be very tough to just “take over”, and it will likely cost lives.
This is an example where knowing about prior discussions here would help. In particular, you seem to be assuming that the AI will take quite a bit of time to get to be in charge. Now, as a conclusion, that’s one I agree with. But a lot of very smart people, such as Eliezer Yudkowsky, consider the chance that an AI might take over in a very short timespan to be very high. And a decent number of LWians agree with Eliezer, or at least consider such results to be likely enough to take seriously. So just working off the assumption that an AI will come to global power but will do so slowly is not a good assumption here: it is one you can preface explicitly as a possibility, saying something like “If AI doesn’t foom very fast, then …”, but taking your position for granted like that is a major reason you are getting downvoted.
Well, thankfully a lot of the people here care enough about the opinions of others that they want to work out a framework that will work well for others.
That’s my point. If they do care about that, then the AI will do it. If it doesn’t, then it’s not working right.
Note, incidentally, that it isn’t necessarily the case that it will even be a win for the programmer. Bad AIs can end up trying to paperclip the Earth.
Bad AIs can, sure. If it’s bad, though, what does it matter who it’s trying to follow orders from? It will ultimately try to turn them into paperclips as well.
Say, for example, that I tell the AI to determine things with a democratic system and to give that the highest priority, and then a majority of people decide to do away with the democracy. What is the AI supposed to do?
It’s only really a contradiction to us. Either the AI has a goal to make sure that there is always a democracy, or it has a goal to simply build a democracy, in which case the democracy can abolish itself if it decides to do so.
This is an example where knowing about prior discussions here would help. In particular, you seem to be assuming that the AI will take quite a bit of time to get to be in charge. Now, as a conclusion, that’s one I agree with. But a lot of very smart people, such as Eliezer Yudkowsky, consider the chance that an AI might take over in a very short timespan to be very high. And a decent number of LWians agree with Eliezer, or at least consider such results to be likely enough to take seriously. So just working off the assumption that an AI will come to global power but will do so slowly is not a good assumption here: it is one you can preface explicitly as a possibility, saying something like “If AI doesn’t foom very fast, then …”, but taking your position for granted like that is a major reason you are getting downvoted.
You’re right. Sorry. There are a lot of variables to consider, and it is one likely scenario. Currently, the internet isn’t interfaced with the actual world enough that you could control everything from it, and I can’t see any possible way any entity could take over. That doesn’t mean it can’t happen, but it’s also wrong to assume it will.
That’s my point. If they do care about that, then the AI will do it. If it doesn’t, then it’s not working right.
So care about other people how? And to what extent? That’s the point of things like CEV.
It’s only really a contradiction to us. Either the AI has a goal to make sure that there is always a democracy, or it has a goal to simply build a democracy, in which case the democracy can abolish itself if it decides to do so.
Insufficient imagination. What if, for example, we tell the AI to try the first one, and then it decides that the solution is to kill the people who don’t support a democracy? That’s the point: even when you’ve got something resembling a rough goal, you are assuming your AI will accomplish the goals the way a human would.
To get some idea of how easily something can go wrong, it might help to, say, read about the stamp collecting device for starters. There’s a lot that can go wrong with an AI. Even dumb optimizers often arrive at answers that are highly unexpected. Smart optimizers have the same problems, but more so.
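To make the “unexpected answers” point concrete, here is a minimal sketch in Python (the actions and payoffs are invented purely for illustration, not taken from the stamp collector post): even a trivially dumb brute-force optimizer returns whatever literally scores highest under the objective it was handed, not what the designer had in mind.

    # Toy sketch: a brute-force optimizer handed the literal goal "maximize stamps".
    # The candidate actions and their payoffs are made up for this example.
    actions = {
        "buy stamps online":       {"stamps": 100},
        "trade with collectors":   {"stamps": 500},
        "hijack a printing press": {"stamps": 10**9},  # not what the designer intended
    }

    def objective(outcome):
        # The goal exactly as specified: more stamps is strictly better.
        return outcome["stamps"]

    best_action = max(actions, key=lambda name: objective(actions[name]))
    print(best_action)  # -> "hijack a printing press"

The optimizer isn’t being clever or malicious; it is just returning the literal maximum of the function it was given, which is the whole point of the stamp collector example.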
Bad AIs can, sure. If it’s bad, though, what does it matter who it’s trying to follow orders from? It will ultimately try to turn them into paperclips as well.
What matters is that an unfriendly AI will make things bad for everyone. If someone screws up just once and makes a very smart paperclipper, then that’s an existential threat to humanity.
You’re right. Sorry. There are a lot of variables to consider, and it is one likely scenario. Currently, the internet isn’t interfaced with the actual world enough that you could control everything from it, and I can’t see any possible way any entity could take over. That doesn’t mean it can’t happen, but it’s also wrong to assume it will.
Well, no one is assuming that it will. But some people assign the scenario a high probability, and the scenario is bad enough that even a very tiny probability is worth taking seriously. Note, incidentally, that there’s a lot a very smart entity could do simply with basic internet access. For example, consider what happens if the AI finds a fast way to factor numbers. Well then, lots of secure communication channels over the internet are now vulnerable. And that’s aside from the more plausible but less dramatic problem of an AI finding flaws in programs that we haven’t yet noticed. Even if our AI just decided to take over most of the world’s computers to increase its processing power, that’s a pretty unpleasant scenario for the rest of us. And that’s on the lower end of problems. Consider how often some bad hacking incident occurs where a system that should never have been online turns out to be accessible online. Now think about how many automated or nearly fully automated plants there are (for cars, for chemicals, for 3-D printing). And that situation will only get worse over the next few years.
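To make the factoring point concrete, here is a minimal sketch in Python (toy textbook-sized primes, nothing like a real 2048-bit modulus): anyone who can factor an RSA-style public modulus can reconstruct the private key in a couple of lines of arithmetic and read the traffic.

    # Toy sketch of why fast factoring breaks RSA-style encryption.
    # Tiny textbook primes; real keys use primes hundreds of digits long.
    p, q = 61, 53              # suppose a fast factoring algorithm handed us these
    n = p * q                  # the public modulus everyone can see
    e = 17                     # the public exponent
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)        # private exponent, immediate once p and q are known (Python 3.8+)

    message = 42
    ciphertext = pow(message, e, n)    # what an eavesdropper normally sees
    recovered = pow(ciphertext, d, n)  # what the factorer can now read
    assert recovered == message

The only hard step is the factoring itself; everything after that is routine arithmetic, which is why a fast factoring algorithm would undermine so much of today’s secure communication at once.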
Worse, a smart AI can likely get people to release it from its box and allow it a lot more free rein. See the AI box test. Even if the AI has trouble dealing with that, an AI with internet access (which you seem to think wouldn’t be that harmful) might not have trouble finding someone sympathetic to it if it portrayed itself sympathetically. These are only some of the most obvious failure modes. It may well be that some of the sneakiest things such an AI could do won’t even occur to us, because they are so far beyond anything humans would think of. It helps for this sort of thing not only to have a minimally restricted imagination, but also to realize that even such an imagination is likely too small to encompass all the possible things that can go wrong.
That’s my point. If they do care about that, then the AI will do it. If it doesn’t, then it’s not working right.
So care about other people how? And to what extent? That’s the point of things like CEV.
If I understand Houshalter correctly, then his idea can be presented using the following story:
Suppose you worked out the theory of building self-improving AGIs with stable goal systems. The only problem left now is to devise an actual goal system that will represent what is best for humanity. So you spend the next several years engaged in deep moral reflection and finally come up with the perfect implementation of CEV completely impervious to the tricks of Dr. Evil and his ilk.
However, the morality upon which you have reflected for all those years isn’t an external force accessible only to humans. It is a computation embedded in your brain. Whatever you ended up doing was the result of your brain-state at the beginning of the story and the stimuli that have affected you since that point. All of this could have been simulated by a Sufficiently Smart™ AGI.
So the idea is: instead of spending those years coming up with the best goal system for your AGI, simply run it and tell it to simulate a counterfactual world in which you did spend them, and then do what you would have done. Whatever results from that, you couldn’t have done better anyway.
Of course, this is all under the assumption that formalizing Coherent Extrapolated Volition is much more difficult than formalizing My Very Own Extrapolated Volition (for any given value of me).
To get some idea of how easily something can go wrong, it might help to, say, read about the stamp collecting device for starters.
Thanks for that link. That is brilliant, especially Eliezer’s comment:
Seth, I see that you were a PhD student in NEU’s Electrical Engineering department. Electrical engineering isn’t very complicated, right? I mean, it’s just:
while device is incomplete
…get some wires
…connect them
The part about getting wires can be implemented by going to a hardware store, and as for connecting them, a soldering iron should do the trick.