If the UFAI convinced you of anything that wasn’t true during the process—outright lies about reality or math—or biased sampling of reality producing a biased mental image, like a story that only depicts one possibility where other possibilities are more probable—then we have a simple and direct critique.
If the UFAI never deceived you in the course of telling the story, but simple measures over the space of possible moral arguments you could hear and moralities you subsequently develop produce a spread of extrapolated volitions “almost all” of whom think that the UFAI-inspired-you has turned into something alien and unvaluable—if it flew through a persuasive keyhole to produce a very noncentral future version of you who is disvalued by central clusters of you—then it’s the sort of thing a Coherent Extrapolated Volition would try to stop.
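One crude way to picture that "spread" criterion is a toy sketch along the following lines, where each extrapolated volition is modeled as a single number produced from whichever subset of moral arguments happens to get heard, and a persuasive keyhole shows up as an outcome far outside the central cluster of the spread. All of the numbers, the argument pool, and the update rule below are illustrative assumptions of mine, not anything specified by CEV.

```python
import random
import statistics

def extrapolate(arguments_heard, strength=0.1):
    """Toy 'extrapolation': start from a neutral value and apply the
    (hypothetical) push of each moral argument that was heard."""
    value = 0.0
    for push in arguments_heard:
        value += strength * push
    return value

random.seed(0)
# Hypothetical pool of moral arguments, each with some push on your values.
argument_pool = [random.gauss(0, 1) for _ in range(50)]

# The spread: many possible subsets of arguments you might happen to hear.
spread = [extrapolate(random.sample(argument_pool, k=30)) for _ in range(1000)]

center = statistics.median(spread)
sigma = statistics.pstdev(spread)

# A 'persuasive keyhole': a deliberately filtered presentation using only the
# most extreme arguments, yielding a noncentral future version of you.
keyhole = extrapolate(sorted(argument_pool)[-30:])

print(f"central cluster: {center:+.2f} +/- {sigma:.2f}")
print(f"keyhole outcome: {keyhole:+.2f} "
      f"({abs(keyhole - center) / sigma:.1f} sigma from center)")
```

In this toy picture, "almost all" of the sampled extrapolations land near the center, and the keyhole outcome is the kind of many-sigma outlier that the criterion above would flag.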
See also #1 on the list of New Humane Rights: “You have the right not to have the spread in your volition optimized away by an external decision process acting on unshared moral premises.”
New Humane Rights:
You have the right not to have the spread in your volition optimized away by an external decision process acting on unshared moral premises.
You have the right to a system of moral dynamics complicated enough that you can only work it out by discussing it with other people who share most of it.
You have the right to be created by a creator acting under what that creator regards as a high purpose.
You have the right to exist predominantly in regions where you are having fun.
You have the right to be noticeably unique within a local world.
You have the right to an angel. If you do not know how to build an angel, one will be appointed for you.
You have the right to exist within a linearly unfolding time in which your subjective future coincides with your decision-theoretical future.
You have the right to remain cryptic.
-- Eliezer Yudkowsky
(originally posted sometime around 2005, probably earlier)
What about the least convenient world where human meta-moral computation doesn’t have the coherence that you assume? If you found yourself living in such a world, would you give up and say no meta-ethics is possible, or would you keep looking for one? If it’s the latter, and assuming you find it, perhaps it can be used in the “convenient” worlds as well?
To put it another way, it doesn’t seem right to me that the validity of one’s meta-ethics should depend on a contingent fact like that. Although perhaps instead of just complaining about it, I should try to think of some way to remove the dependency...
(We also disagree about the likelihood that the coherence assumption holds, but I think we went over that before, so I’m skipping it in the interest of avoiding repetition.)
I think this is about metamorals not metaethics—yes, I’m merely defining terms here, but I consider “What is moral?” and “What is morality made of?” to be problems that invoke noticeably different issues. We already know, at this point, what morality is made of; it’s a computation. Which computation? That’s a different sort of question and I don’t see a difficulty in having my answer depend on contingent facts I haven’t learned.
In response to your question: yes, if I had given a definition of moral progress where it turned out empirically that there was no coherence in the direction in which I was trying to point and the past had been a random walk, then I should reconsider my attempt to describe those changes as “progress”.
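As a rough sketch of what "checking whether the past was a random walk" could even mean numerically: for a pure random walk the expected net displacement scales like the square root of the number of steps times the typical step size, so a history whose net displacement is many times that looks like consistent drift rather than a walk. The series and the scoring rule below are my own illustrative assumptions, with made-up numbers.

```python
import math
import random

def drift_score(series):
    """Net displacement divided by the sqrt(n)-scaled displacement a pure
    random walk would predict. Values near 1 look like a walk; values much
    larger suggest a consistent direction ('progress')."""
    steps = [b - a for a, b in zip(series, series[1:])]
    step_rms = math.sqrt(sum(s * s for s in steps) / len(steps))
    expected_walk = step_rms * math.sqrt(len(steps))
    return abs(series[-1] - series[0]) / expected_walk

random.seed(0)
# Hypothetical 'moral index' histories: one a pure random walk,
# one with a small consistent push in a fixed direction.
walk, directed = [0.0], [0.0]
for _ in range(200):
    walk.append(walk[-1] + random.gauss(0, 1))
    directed.append(directed[-1] + random.gauss(0.3, 1))

print(f"random walk : drift score {drift_score(walk):.2f}")
print(f"directed    : drift score {drift_score(directed):.2f}")
```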
Which computation? That’s a different sort of question and I don’t see a difficulty in having my answer depend on contingent facts I haven’t learned.
How do you cash “which computation?” out to logical+physical uncertainty? Do you have in mind some well-defined metamoral computation that would output the answer?
I think you just asked me how to write an FAI. So long as I know that it’s made out of logical+physical uncertainty, though, I’m not confused in the same way that I was confused in, say, 1998.
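For what the "made out of logical+physical uncertainty" framing might look like in miniature: keep a distribution over candidate value computations (logical uncertainty about which computation "morality" denotes) and a distribution over world states (physical uncertainty about consequences), and score actions by the double expectation. Everything below, including the candidate computations and the outcome table, is a made-up toy of mine, not a proposal for how an FAI would actually do this.

```python
# Toy sketch: hypothetical candidate 'moral computations' and world states.
# All functions and numbers here are illustrative assumptions.

candidate_values = {          # logical uncertainty: which computation is 'morality'?
    "v1": lambda outcome: outcome["welfare"],
    "v2": lambda outcome: outcome["welfare"] + 2 * outcome["fairness"],
}
p_logical = {"v1": 0.6, "v2": 0.4}

world_states = {"w1": 0.7, "w2": 0.3}   # physical uncertainty over world states

def outcome(action, world):
    """Hypothetical consequences of each action in each world."""
    table = {
        ("act_a", "w1"): {"welfare": 5, "fairness": 1},
        ("act_a", "w2"): {"welfare": 2, "fairness": 1},
        ("act_b", "w1"): {"welfare": 4, "fairness": 3},
        ("act_b", "w2"): {"welfare": 3, "fairness": 2},
    }
    return table[(action, world)]

def expected_value(action):
    """Double expectation: over which value computation is right,
    and over which world state obtains."""
    return sum(
        p_logical[name] * p_w * value_fn(outcome(action, world))
        for name, value_fn in candidate_values.items()
        for world, p_w in world_states.items()
    )

actions = ["act_a", "act_b"]
best = max(actions, key=expected_value)
print({a: round(expected_value(a), 2) for a in actions}, "->", best)
```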
“Well-defined” may have been too strong a term, then; I meant to include something like CEV as described in 2004.
Is there an infinite regress of not knowing how to compute morality, or how to compute (how to compute morality), or how to compute (how to compute (...)), that you need to resolve; do you currently think you have some idea of how it bottoms out; or is there a third alternative that I should be seeing?
it doesn’t seem right to me that the validity of one’s meta-ethics should depend on a contingent fact like that
I think it is a powerful secret of philosophy and AI design that all useful philosophy depends upon the philosopher(s) observing contingent facts from their sensory input stream. Philosophy can be thought of as an ultra-high-level machine learning technique that records the highest-level regularities of our input/output streams. And the reason I said this is a powerful AI design principle is that you then realize your AI can do good philosophy by looking for such regularities.
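A minimal sketch of "looking for regularities in an input stream," in the spirit of that last sentence: fit a few candidate models to the stream and keep whichever predicts it best. The stream and the two candidate models below are toy assumptions, nothing like an actual AI design.

```python
# Toy regularity finder: score candidate models of an input stream by
# one-step prediction error and keep the best. Purely illustrative.

stream = [i % 3 for i in range(30)]   # hypothetical input stream: 0,1,2,0,1,2,...

def constant_model(history):
    """Predict 'tomorrow is like today'."""
    return history[-1] if history else 0

def periodic_model(history, period=3):
    """Predict a repeat of the value one period ago."""
    return history[-period] if len(history) >= period else 0

def prediction_error(model, stream):
    """Sum of squared one-step prediction errors over the stream."""
    return sum((model(stream[:t]) - stream[t]) ** 2 for t in range(1, len(stream)))

models = {"constant": constant_model, "periodic": periodic_model}
errors = {name: prediction_error(m, stream) for name, m in models.items()}
best = min(errors, key=errors.get)
print(errors, "-> best regularity:", best)
```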