In your view, is there such a thing as the best rationalization of one’s values, or is any rationalization as good as another? If there is a best rationalization, what are its properties? For example, should I try to make my normative theory fit my emotions as closely as possible, or also take simplicity and/or elegance into consideration?
What counts as a virtue in any model depends on what you’re using that model for. If you’re chiefly concerned with accuracy, then you want your normative theory to fit your values as closely as possible. But maybe the most accurate model takes too long to run on your hardware; in that case you might prefer a simpler, more elegant model. Maybe there are hard limits to how accurate we can make such models, and we will be willing to settle for good enough.
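To make the accuracy-versus-tractability trade-off concrete, here is a minimal sketch in Python. The candidate models, their “fit” scores, their costs, and the penalty weight are all invented for illustration; nothing here is proposed in the thread itself.

```python
# Illustrative sketch only: "fit" and "cost" are made-up numbers standing in for
# how well a candidate normative theory matches someone's values and how hard
# it is to actually compute with.

candidate_models = {
    "fit_emotions_exactly": {"fit": 0.95, "cost": 50.0},  # very accurate, expensive to apply
    "simple_elegant_rules": {"fit": 0.70, "cost": 2.0},   # cruder, but cheap to run
    "good_enough_hybrid":   {"fit": 0.85, "cost": 10.0},
}

def score(model, cost_penalty=0.01):
    """Trade accuracy against tractability; the penalty weight is a free choice."""
    return model["fit"] - cost_penalty * model["cost"]

best = max(candidate_models, key=lambda name: score(candidate_models[name]))
print(best)  # which model wins depends entirely on how heavily you penalize cost
```

Which model counts as “best” here is fixed only once you pick the penalty weight, which is the point of the comment above: the virtue being optimized for is a choice, not a fact.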
What if, as seems likely, I find out that the most straightforward translation of my emotions into a utility function gives a utility function that is based on a crazy ontology, and it’s not clear how to translate my emotions into a utility function based on the true ontology of the world (or my current best guess as to the true ontology). What should I do then?
Whatever our best ontology is, it will always have some loose analog in our evolved, folk ontology. So we should try our best to make it fit. There will always be weird edge cases that arise as our ontology improves and our circumstances diverge from our ancestors’, e.g. “are fetuses in the class of things we should have empathy for?” Expecting evolution to have encoded an elegant set of principles in the true ontology is obviously crazy. There isn’t much one can do about it if you want to preserve your values. You could decide that you care more about obeying a simple, elegant moral code than you do about your moral intuition/emotional response (perhaps because you have a weak or abnormal emotional response to begin with). Whether you should do one or the other is just a meta-moral judgment and people will have different answers because the answer depends on their psychological disposition. But I think realizing that we aren’t talking about facts but trying to describe what we value makes elegance and simplicity seem less important.
There isn’t much one can do about it if you want to preserve your values.
I dispute the assumption that my emotions represent my values. Since the part of me that has to construct a utility function (let’s say for the purpose of building an FAI) is the deliberative thinking part, why shouldn’t I (i.e., that part of me) dis-identify with my emotional side? Suppose I do, then there’s no reason for me to rationalize “my” emotions (since I view them as just the emotions of a bunch of neurons that happen to be attached to me). Instead, I could try to figure out from abstract reasoning alone what I should value (falling back to nihilism if ultimately needed).
According to anti-realism, this is just as valid a method of coming up with a normative theory as any other (that somebody might have the psychological disposition to choose), right?
Alternatively, what if I think the above may be something I should do, but I’m not sure? Does anti-realism offer any help besides that it’s “just a meta-moral judgment and people will have different answers because the answer depends on their psychological disposition”?
A superintelligent moral psychologist might tell me that there is one text file which, if I were to read it, would cause me to do what I described earlier, and another text file which would cause me to choose to rationalize my emotions instead, and therefore I can’t really be said to have an intrinsic psychological disposition in this matter. What does anti-realism say is my morality in that case?
I dispute the assumption that my emotions represent my values.
Me too. There are people who consistently judge that their morality has “too little” motivational force, and there are people who perceive their morality to have “too much” motivational force. And there are people who deem themselves under-motivated by certain moral ideals and over-motivated by others. None of these would seem possible if moral beliefs simply echoed (projected) emotion. (One could, of course, object to one’s past or anticipated future motivation, but not one’s present; nor could the long-term averages disagree.)
Since the part of me that has to construct a utility function (let’s say for the purpose of building an FAI) is the deliberative thinking part, why shouldn’t I (i.e., that part of me) dis-identify with my emotional side? Suppose I do, then there’s no reason for me to rationalize “my” emotions (since I view them as just the emotions of a bunch of neurons that happen to be attached to me). Instead, I could try to figure out from abstract reasoning alone what I should value (falling back to nihilism if ultimately needed). According to anti-realism, this is just as valid a method of coming up with a normative theory as any other (that somebody might have the psychological disposition to choose), right?
First, this scenario is just impossible. One cannot dis-identify from one’s ‘emotional side’. That’s not a thing. If someone thinks they’re doing that they’ve probably smuggled their emotions into their abstract reasons (see, for example, Kant). Second, it seems silly, even dumb, to give up on making moral judgments and become a nihilist just because you’d like there to be a way to determine moral principles from abstract reasoning alone. Most people are attached to their morality and would like to go on making judgments. If someone has such a strong psychological need to derive morality through abstract reasoning alone that they’re just going to give up morality: so be it, I guess. But that would be a very not-normal person and not at all the kind of person I would want to have programming an FAI.
But yes: ultimately my values enter into it, and my values may not be everyone else’s. So of course there is no fact of the matter about the “right” way to do something. Nevertheless, there are still no moral facts.
You seem to be asking anti-realism to supply you with answers to normative questions. But what anti-realism tells you is that such questions don’t have factual answers. I’m telling you what morality is. To me, the answer has some implications for FAI but anti-realism certainly doesn’t answer questions that it says there aren’t answers to.
One cannot dis-identify from one’s ‘emotional side’. That’s not a thing.
In order to rationalize my emotions, I have to identify with them in the first place (as opposed to the emotions of my neighbor, say). Especially if I’m supposed to apply descriptive moral psychology, instead of just confabulating unreflectively based on whatever emotions I happen to feel at any given moment. But if I can identify with them, why can’t I dis-identify from them?
If someone thinks they’re doing that they’ve probably smuggled their emotions into their abstract reasons (see, for example, Kant).
That doesn’t stop me from trying. In fact moral psychology could be a great help in preventing such “contamination”.
You seem to be asking anti-realism to supply you with answers to normative questions. But what anti-realism tells you is that such questions don’t have factual answers.
If those questions don’t have factual answers, then I could answer them any way I want, and not be wrong. On the other hand if they do have factual answers, then I better use my abstract reasoning skills to find out what those answers are. So why shouldn’t I make realism the working assumption, if I’m even slightly uncertain that anti-realism is true? If that assumption turns out to be wrong, it doesn’t matter anyway—whatever answers I get from using that assumption, including nihilism, still can’t be wrong. (If I actually choose to make that assumption, then I must have a psychological disposition to make that assumption. So anti-realism would say that whatever normative theory I form under that assumption is my actual morality. Right?)
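The dominance structure of this argument can be laid out as a simple case enumeration. The sketch below is just one rendering of the cases stated above, not a formalization anyone in the thread offers.

```python
# Case enumeration for the "make realism the working assumption" argument.
# This only restates the cases from the comment above; nothing is computed
# beyond printing the table.

cases = [
    ("moral realism true",  "assume realism",      "reasoning can go looking for the factual answers"),
    ("moral realism true",  "assume anti-realism", "answering 'any way I want' may get the facts wrong"),
    ("moral realism false", "assume realism",      "whatever I conclude still can't be wrong"),
    ("moral realism false", "assume anti-realism", "whatever I conclude still can't be wrong"),
]

for world, working_assumption, upshot in cases:
    print(f"{world:20} | {working_assumption:20} | {upshot}")
```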
I’m telling you what morality is.
Can you answer the last question in the grandparent comment, which was asking just this sort of question?
If those questions don’t have factual answers, then I could answer them any way I want, and not be wrong.
That’s true as stated, but “not being wrong” isn’t the only thing you care about. According to your current morality, those questions have moral answers, and you shouldn’t answer them any way you want, because that could be evil.
When you say “you shouldn’t answer them any way you want” are you merely expressing an emotional dissatisfaction, like Jack?
If it’s meant to be more than an expression of emotional dissatisfaction, I guess “should” means “what my current morality recommends” and “evil” means “against my current morality”, but what do you mean by “current morality”? As far as I can tell, according to anti-realism, my current morality is whatever morality I have the psychological disposition to construct. So if I have the psychological disposition to construct it using my intellect alone (or any other way), how, according to anti-realism, could that be evil?
By “current morality” I mean that the current version of you may dislike some outcomes of your future moral deliberations if Omega shows them to you in advance. It’s quite possible that you have a psychological disposition to eventually construct a moral system that the current version of you will find abhorrent. For an extreme test case, imagine that your long-term “psychological dispositions” are actually coming from a random number generator; that doesn’t mean you cannot make any moral judgments today.
It’s quite possible that you have a psychological disposition to eventually construct a moral system that the current version of you will find abhorrent.
I agree it’s quite possible. Suppose I do somehow find out that the current version of me emotionally dislikes the outcomes of my future moral deliberations. I still have to figure out what to do about that. Is there a normative fact about what I should do in that case? Or is there only a psychological disposition?
I think there’s only a psychological disposition. If the future of your morals looked abhorrent enough to you, I guess you’d consider it moral to steer toward a different future.
Ultimately we seem to be arguing about the meaning of the word “morality” inside your head. Why should that concept obey any simple laws, given that it’s influenced by so many random factors inside and outside your head? Isn’t that like trying to extrapolate the eternally true meaning of the word “paperclip” based on your visual recognition algorithms, which can also crash on hostile input?
I appreciate your desire to find some math that could help answer moral questions that seem too difficult for our current morals. But I don’t see how that’s possible, because our current morals are very messy and don’t seem to have any nice invariants.
Why should that concept obey any simple laws, given that it’s influenced by so many random factors inside and outside your head?
Every concept is influenced by many random factors inside and outside my head, which does not rule out that some concepts can be simple. I’ve already given one possible way in which that concept can be simple: someone might be a strong deliberative thinker and decide to not base his morality on his emotions or other “random factors” unless he can determine that there’s a normative fact that he should do so.
Emotions are just emotions. They do not bind us, like a utility function binds an EU maximizer. We’re free to pick a morality that is not based on our emotions. If we do have a utility function, it’s one that we can’t see at this point, and I see no strong reason to conclude that it must be complex.
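For contrast with the claim that emotions do not bind us, here is a minimal sketch of what it means for a utility function to bind an expected-utility maximizer: once the utilities and probabilities are fixed, the choice is fully determined. The actions, outcomes, and numbers are invented for illustration.

```python
# Minimal expected-utility maximizer: its choice is fully determined by its
# (fixed) utility function and its beliefs. All numbers are invented.

outcomes = {
    "A": [("win", 0.3), ("lose", 0.7)],   # (outcome, probability) pairs
    "B": [("win", 0.1), ("lose", 0.9)],
}
utility = {"win": 10.0, "lose": -1.0}     # the utility function that "binds" the agent

def expected_utility(action):
    return sum(p * utility[o] for o, p in outcomes[action])

choice = max(outcomes, key=expected_utility)
print(choice)  # "A": the agent cannot do otherwise without different utilities or beliefs
```

The contrast the comment draws is that nothing analogous pins a human's moral deliberation to their emotional responses in this way.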
Isn’t that like trying to extrapolate the eternally true meaning of the word “paperclip” based on your visual recognition algorithms, which can also crash on hostile input?
How do we know it’s not more like trying to extrapolate the eternally true meaning of the word “triangle”?
But I don’t see how that’s possible, because our current morals are very messy and don’t seem to have any nice invariants.
Thinking that humans have a “current morality” seems similar to a mistake that I was on the verge of making before, of thinking that humans have a “current decision theory” and therefore we can solve the FAI decision theory problem by finding out what our current decision theory is, and determining what it says we should program the FAI with. But in actuality, we don’t have a current decision theory. Our “native” decision making mechanisms (the ones described in Luke’s tutorial) can be overridden by our intellect, and no “current decision theory” governs that part of our brains. (A CDT theorist can be convinced to give up CDT, and not just for XDT, i.e., what a CDT agent would actually self-modify into.) So we have to solve that problem with “philosophy” and I think the situation with morality may be similar, since there is no apparent “current morality” that governs our intellect.
How do we know it’s not more like trying to extrapolate the eternally true meaning of the word “triangle”?
Even without going into the complexities of human minds: do you mean triangle in formal Euclidean geometry, or triangle in the actual spacetime we’re living in? The latter concept can become arbitrarily complex as we discover new physics, and the former one is an approximation that’s simple because it was selected for simplicity (being easy to use in measuring plots of land and such). Why do you expect the situation to be different for “morality”?
In order to rationalize my emotions, I have to identify with them in the first place (as opposed to the emotions of my neighbor, say). Especially if I’m supposed to apply descriptive moral psychology, instead of just confabulating unreflectively based on whatever emotions I happen to feel at any given moment. But if I can identify with them, why can’t I dis-identify from them?
I’m not sure I actually understand what you mean by “dis-identify”.
If those questions don’t have factual answers, then I could answer them any way I want, and not be wrong. On the other hand if they do have factual answers, then I better use my abstract reasoning skills to find out what those answers are. So why shouldn’t I make realism the working assumption, if I’m even slightly uncertain that anti-realism is true? If that assumption turns out to be wrong, it doesn’t matter anyway—whatever answers I get from using that assumption, including nihilism, still can’t be wrong.
So Pascal’s Wager?
In any case, while there aren’t wrong answers, there are still immoral ones. There is no fact of the matter about normative ethics, but there are still hypothetical AIs that do evil things.
In any case, while there aren’t wrong answers, there are still immoral ones. There is no fact of the matter about normative ethics, but there are still hypothetical AIs that do evil things.
Then there is a fact of the matter about which answers are moral, and we might as well call those that aren’t “incorrect”.
Then there is a fact of the matter about which answers are moral, and we might as well call those that aren’t “incorrect”.
It seems like a waste to overload the meaning of the word “incorrect” to also include such things as “Fuck off! That doesn’t satisfy socially oriented aspects of my preferences. I wish to enforce different norms!”
It really is useful to emphasize a carve in reality between ‘false’ and ‘evil/bad/immoral’. Humans are notoriously bad at keeping the concepts distinct in their minds and allowing ‘incorrect’ (and related words) to be used for normative claims encourages even more motivated confusion.
No. Moral properties don’t exist. What I’m doing, per the post, when I say “There are immoral answers” is expressing an emotional dissatisfaction with certain answers.
Me too. There are people who consistently judge that their morality has “too little” motivational force, and there are people who perceive their morality to have “too much” motivational force. And there are people who deem themselves under-motivated by certain moral ideals and over-motivated by others. None of these would seem possible if moral beliefs simply echoed (projected) emotion. (One could, of course, object to one’s past or anticipated future motivation, but not one’s present; nor could the long-term averages disagree.)
See “weak internalism”. There can still be competing motivational forces and non-moral emotions.
Can you answer the last question in the grandparent comment, which was asking just this sort of question?
Which question exactly?
No. Moral properties don’t exist. What I’m doing, per the post, when I say “There are immoral answers” is expressing an emotional dissatisfaction with certain answers.
True.