I don’t even see how one would start to research the problem of getting a hypothetical AGI to recognize humans as distinguished beings. Solving this problem would seem to require, as a prerequisite, an understanding of the makeup of the hypothetical AGI, something people don’t seem to have a clear grasp of at the moment. And even with a model of a hypothetical AGI in hand, writing code conducive to it recognizing humans as distinguished beings seems like an intractable task.
If ethical properties, statements, attitudes, and judgments do ultimately correlate with facts about human brains, it might be possible to derive mathematical models of moral terms or judgments from brain data. The problem with arriving at the meaning of morality solely by contemplation is that you risk introducing new meanings based on higher-order cognition and intuitions, rather than figuring out what humans as a whole mean by morality.
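To make “mathematical models of moral judgments from brain data” slightly more concrete, here is a minimal Python sketch. Everything in it is synthetic: random vectors stand in for per-scenario brain features, and the “judged wrong” labels are generated from them, so it only illustrates the shape such a model might take, not how one would actually obtain it.

```python
import numpy as np

# Entirely synthetic stand-in data: random vectors play the role of per-scenario
# brain features, and the binary "judged wrong" labels are generated from them.
rng = np.random.default_rng(0)
n_scenarios, n_features = 200, 8
X = rng.normal(size=(n_scenarios, n_features))
true_w = rng.normal(size=n_features)
y = (X @ true_w + rng.normal(scale=0.5, size=n_scenarios) > 0).astype(float)

# Fit a logistic model p("judged wrong" | features) by plain gradient descent.
w = np.zeros(n_features)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.1 * (X.T @ (p - y)) / n_scenarios

p = 1.0 / (1.0 + np.exp(-(X @ w)))
print(f"in-sample accuracy of the fitted judgment model: {np.mean((p > 0.5) == y):.2f}")
```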
Two possible steps towards friendly AI/CEV (just some quick ideas):
1.) We want the AGI (CEV) to extrapolate our volition in a certain, ethical way. That is, it shouldn’t, for example, create models of humans and hurt them just to figure out what we dislike. But in the end it won’t be enough to write blog posts in English. We might have to put real people into brain scanners and derive mathematically precise thresholds for states like general indisposition and unethical behavior. Such models could then be built into the utility function of an AGI, whereas blog posts written in natural language can’t be (a toy sketch of how such a threshold could enter a utility function follows after this list).
2.) We don’t know if CEV is itself wished for and considered ethical by most humans. If you don’t assume that all humans are alike, what makes you think that your personal solution, your answer to those questions, will be universally accepted? A rich white atheist male living in a Western country who is interested in topics like philosophy and mathematics does not seem to be someone who can speak for the rest of the world. If we are very concerned with the ethics of CEV in and of itself, we might have to come up with a way to execute an approximation of CEV before AGI is invented. We might need large-scale social experiments and surveys to see if something like CEV is even desirable (the second sketch below gives a sense of what survey numbers alone can and cannot establish). Writing a few vague blog posts about it doesn’t seem to get us the certainty we need before altering the universe irrevocably.
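Regarding the threshold idea in 1.), here is a toy sketch with made-up numbers throughout: a hypothetical calibrated scale for “general indisposition” and a hypothetical threshold above which states are treated as unacceptable. The only point is that a numeric threshold is the kind of thing a utility function can use, while an English sentence isn’t.

```python
# Made-up numbers throughout: a hypothetical calibrated scale for "general
# indisposition" and a hypothetical threshold above which states were found
# to be reliably reported as unacceptable.
INDISPOSITION_THRESHOLD = 0.7
PENALTY_WEIGHT = 1000.0

def utility(task_value: float, predicted_indisposition: float) -> float:
    """Score an outcome: its task value, minus a steep penalty past the threshold."""
    overshoot = predicted_indisposition - INDISPOSITION_THRESHOLD
    if overshoot > 0:
        return task_value - PENALTY_WEIGHT * overshoot
    return task_value

print(utility(task_value=10.0, predicted_indisposition=0.3))  # 10.0, acceptable state
print(utility(task_value=10.0, predicted_indisposition=0.9))  # -190.0, heavily penalized
```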
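And on the survey question in 2.), a small sketch of what “the certainty we need” might mean numerically: even under the generous assumption of a clean simple random sample, a survey of ten thousand people only pins down the share of humanity that endorses something like CEV to within a percentage point or two, before even considering sampling bias, translation, or strategic answering. The numbers below are invented.

```python
from math import sqrt

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion from a simple random sample."""
    p_hat = successes / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width

# Invented numbers: 6,200 of 10,000 respondents say they would endorse running CEV.
low, high = wilson_interval(6200, 10000)
print(f"estimated endorsement: {low:.1%} to {high:.1%}")
```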
If CEV encounters a large proportion of the population that wishes it were not run, and would continue to wish so after extrapolation, it simply stops and reports that fact. That’s one of the points of the method. It is, in and of itself, a large-scale social survey of present and future humanity. And if the groups that wouldn’t want it run now would want it run after extrapolation, I’m fine with running it against their present wishes, and hope that if I were part of a group under similar circumstances someone else would do the same: “past me” is an idiot, I’m not much better, and “future me” is hopefully an even bigger improvement, while “desired future me” almost certainly is.
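A toy version of that stop-and-report rule, just to make the logic explicit. The data structures, the 10% cutoff for “a large proportion”, and of course the “extrapolated” stances are all invented for illustration; the hard part, extrapolation itself, is simply assumed as an input.

```python
from dataclasses import dataclass

@dataclass
class Person:
    opposes_now: bool
    opposes_after_extrapolation: bool  # stand-in for the output of extrapolation

ABORT_THRESHOLD = 0.10  # hypothetical reading of "a large proportion"

def run_or_abort(population: list[Person]) -> str:
    # Only people who oppose now *and* would keep opposing after extrapolation count
    # against running; present-only opposition is overridden, per the comment above.
    persistent = sum(p.opposes_now and p.opposes_after_extrapolation for p in population)
    share = persistent / len(population)
    if share > ABORT_THRESHOLD:
        return f"abort and report: {share:.0%} of the population persistently opposes running CEV"
    return "proceed"

population = [Person(False, False)] * 80 + [Person(True, False)] * 15 + [Person(True, True)] * 5
print(run_or_abort(population))  # 5% persistent opposition -> "proceed"
```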
A related post is ‘Friendly AI Research and Taskification’.