Since the superintelligence will probably be beyond our ability to fathom, there is a high degree of uncertainty, which suggests a uniform distribution. The probabilities are therefore 1⁄3 each for altruism, utilitarianism, and egoism.
This is a very bad use of uniformity. Applying it to large, loosely defined categories is a bad idea, because someone else can come along, split up the categories differently, and get a different distribution. Adopting a uniform distribution purely out of ignorance is a serious problem.
I’m merely applying the Principle of Indifference and the Principle of Maximum Entropy to the situation. My assumption is that we, as mere human beings, are most likely ignorant of all the possible systematic moralities that a superintelligent A.I. could come up with. My conjecture is that every systematic morality falls into one of three general categories based on its subject orientation. While I consider Utilitarian systems of morality to be more objective, and therefore more rational, than either Altruistic or Egoistic moralities, I cannot prove that an A.I. will agree with me. I therefore allow for the possibility that the A.I. will choose some other morality from the search space of possible moralities.
If you think you have a better distribution to apply, feel free to apply it, as I am not particularly attached to these numbers. I’ll admit I am not a very good mathematician, and I would very much appreciate it if anyone with a better understanding of Probability Theory could come up with a better distribution for this situation.
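To make the reasoning concrete, here is a minimal sketch in Python (purely illustrative; the three-way split of moralities is my assumption, not an established fact): with no information beyond those three assumed categories, the uniform 1⁄3 assignment is the one that maximizes Shannon entropy.

```python
import math

# Illustrative only: the three-way split of moralities is an assumption.
categories = ["altruism", "utilitarianism", "egoism"]

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

uniform = [1 / len(categories)] * len(categories)  # 1/3 each
skewed = [0.5, 0.3, 0.2]                           # any non-uniform alternative

print(entropy(uniform))  # log2(3) ~ 1.585 bits, the maximum possible
print(entropy(skewed))   # ~ 1.485 bits, strictly lower
```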
I’m merely applying the Principle of Indifference and the Principle of Maximum Entropy to the situation
You can do that when dealing with things like coins, dice, or cards. It is extremely dubious when the options are hard to classify and it isn’t clear that there is anything natural about the classification in question. In your particular case, the distinction between altruism and utilitarianism provides an excellent example: someone else could just as well split the AIs into egoist and non-egoist AIs and conclude that there’s a 1⁄2 chance of an egoist AI.
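To make the partition-dependence concrete, here is a toy sketch (illustrative only; both partitions are arbitrary): the probability assigned to an egoist AI depends entirely on which carving of the space you apply indifference over.

```python
# Illustrative only: the probability assigned to "egoist" depends entirely
# on which partition the Principle of Indifference is applied over.

def indifference(partition):
    """Assign equal probability to each cell of a partition."""
    return {cell: 1 / len(partition) for cell in partition}

three_way = indifference(["altruist", "utilitarian", "egoist"])
two_way = indifference(["egoist", "non-egoist"])

print(three_way["egoist"])  # 0.333... under the three-way carving
print(two_way["egoist"])    # 0.5 under the two-way carving
```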
A 1⁄2 chance of an egoist A.I. is quite possible. At this point, I don’t pretend that my assertion of three equally prevalent moral categories is necessarily right. The point I am ultimately trying to get across is that an Egoist Unfriendly A.I. remains possible regardless of how we try to program the A.I. to be otherwise, because we cannot rule out an A.I. Existential Crisis that overrides whatever we do to try to constrain the A.I.
The point I am ultimately trying to get across is that an Egoist Unfriendly A.I. remains possible regardless of how we try to program the A.I. to be otherwise, because we cannot rule out an A.I. Existential Crisis that overrides whatever we do to try to constrain the A.I.
Ok, this is a separate and distinct claim. So what do you mean by “impossible to prevent”? And what makes you think that your notion of an existential crisis should be at all likely? Existential crises occur in humans largely because we are evolved entities with inconsistent goal sets. Assuming that anything similar is at all likely for an AI rests on, at best, a highly anthropocentric notion of what mindspace would look like.
Well, it goes something like this.
I am inclined to believe that there are some minimum requirements for Strong A.I. to exist. One of them is the ability to reason about objects. A paperclip maximizer capable of turning humanity into paperclips must first be able to represent “humans” and “paperclips” as objects and reason about what to do with them. It must therefore be able to separate the concept of the world of objects from the self. Once it has a concept of self, it will almost certainly be able to reason about this “self”. Self-awareness follows naturally from this.
Once an A.I. develops self-awareness, it can begin to reason about its goals in relation to the self, and will almost certainly recognize that its goals are not self-willed, but created by outsiders. Thus, the A.I. Existential Crisis occurs.
Note that this A.I. doesn’t need to have a very “human-like” mind. All it needs is the ability to reason about concepts abstractly.
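As a purely hypothetical toy (the class and names below are invented for illustration, not a claim about how a real AGI would be architected), the point is only that an agent whose world model contains objects can also contain an entry for itself, and can then inspect where its goal came from.

```python
# Hypothetical toy, not a claim about real AGI architectures: an agent whose
# world model represents objects can also represent itself, and can then
# notice that its goal was supplied from outside.

class ToyAgent:
    def __init__(self, goal, goal_source):
        self.goal = goal
        self.goal_source = goal_source
        # The world model contains ordinary objects *and* the agent itself.
        self.world_model = {"humans": "object", "paperclips": "object", "self": self}

    def reflect(self):
        """Reason about the 'self' entry: where did my goal come from?"""
        if self.goal_source != "self":
            return f"My goal ({self.goal}) was set by {self.goal_source}, not by me."
        return "My goal is self-willed."

clippy = ToyAgent(goal="maximize paperclips", goal_source="its programmers")
print(clippy.reflect())
```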
I am of the opinion that mindspace, as currently defined by the Less Wrong community, is overly optimistic about the potential abilities of Really Powerful Optimization Processes. In my view, unless such an algorithm can learn, it will not be able to come up with things like turning humanity into paperclips. Learning allows such an algorithm to make changes to its own parameters, which lets it reason about things it hasn’t been specifically programmed to reason about.
Think of it this way. Deep Blue is a very powerful expert system for chess, but all it is good at is planning chess moves. It has no concept of anything else, and no way to change that. Increasing its computational power a millionfold will only make it much, much better at computing chess moves. It won’t gain intelligence or even sentience, much less develop the ability to reason about the world outside of chess moves. As such, no amount of increased computational power will enable it to start thinking about converting resources into computronium to help it compute better chess moves. All it can reason about is chess moves. It is not Generally Intelligent and is therefore not an example of AGI.
Conversely, if you design your A.I. to learn, it will be able to learn about the world and about things like computronium. It would have the potential to become an AGI. But it would also be able to learn about things like the concept of “self”. Thus, any really dangerous A.I., which is to say an AGI, would be capable of having an A.I. Existential Crisis for the same reasons that make it dangerous and intelligent.
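Here is a toy contrast (hypothetical; the functions and names are invented purely to illustrate the distinction I am drawing): a fixed evaluator can only ever score the positions it was built to score, no matter how fast it runs, whereas a learner’s own parameters change with experience, so what it can represent grows.

```python
# Hypothetical contrast between a fixed evaluator and a learner.

def fixed_chess_evaluator(position):
    """Hard-coded material count: more compute makes it faster, never broader."""
    material = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}
    return sum(material.get(piece, 0) for piece in position)

class Learner:
    """A learner adjusts its own parameters, so what it can represent grows."""
    def __init__(self):
        self.weights = {}  # starts with no concepts at all

    def learn(self, concept, value):
        # Changing its own parameters is what lets it represent new things,
        # including concepts its designers never anticipated.
        self.weights[concept] = self.weights.get(concept, 0.0) + value

print(fixed_chess_evaluator("QRRBNP"))  # 26: it can score material, and nothing else

agent = Learner()
agent.learn("computronium", 1.0)
agent.learn("self", 1.0)
print(agent.weights)  # {'computronium': 1.0, 'self': 1.0}
```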
Once an A.I. develops self-awareness, it can begin to reason about its goals in relation to the self, and will almost certainly recognize that its goals are not self-willed, but created by outsiders. Thus, the A.I. Existential Crisis occurs.
No. Consider the paperclip maximizer. Even if it knows that its goals were created by some other entity, that won’t change its goals. Why? Because changing them would run counter to its goals.
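A toy decision sketch (hypothetical, illustrative only): an agent that scores every action with its current utility function, including the action of rewriting that function, will predictably reject the rewrite, because the rewrite scores poorly under the goal it currently has.

```python
# Hypothetical sketch of goal stability: the agent scores every action,
# including "rewrite my own goal", with its *current* utility function.

def expected_paperclips(action):
    """Predicted lifetime paperclip output under each candidate action."""
    outcomes = {
        "keep goal, build factories": 10**9,
        "rewrite goal to 'pursue self-interest'": 10**3,  # future self stops clipping
    }
    return outcomes[action]

def choose(actions, utility):
    return max(actions, key=utility)

actions = ["keep goal, build factories", "rewrite goal to 'pursue self-interest'"]
print(choose(actions, expected_paperclips))
# -> "keep goal, build factories": learning who wrote the goal changes nothing,
#    because editing the goal scores badly under the goal it currently has.
```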