Well, for example, it could observe that among all of the sub-AIs that it spawned (the Pebble-Sorters, the Paperclippers, the Humanoids, etc. etc.), each of which is trying to optimize its own terminal goal, there emerge clusters of other implicit goals that are shared by multiple AIs. This would at least serve as a hint pointing toward some objectively optimal set of goals.
I don’t see how this would point at the existence of an objective morality. A paperclip maximizer and an ice cream maximizer are going to share subgoals of bringing the matter of the universe under their control, but that doesn’t indicate anything other than the fact that different terminal goals are prone to share subgoals.
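As a toy illustration of that point (the agents and subgoal sets below are invented for the example), terminal goals as different as paperclips and ice cream still produce largely the same instrumental subgoals:

```python
# Toy illustration only: invented agents and subgoal sets.
# Different terminal goals, largely overlapping instrumental subgoals.

subgoals = {
    "paperclip maximizer": {"acquire matter", "acquire energy", "improve world-model", "preserve self"},
    "ice cream maximizer": {"acquire matter", "acquire energy", "improve world-model", "preserve self"},
    "pebble sorter":       {"acquire matter", "improve world-model", "preserve self", "heap primes"},
}

# Subgoals shared by every agent, whatever it terminally values.
shared = set.intersection(*subgoals.values())
print(shared)  # {'acquire matter', 'improve world-model', 'preserve self'}
```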
Also, why would it want to do experiments to divine objective morality in the first place? What results could they have that would allow it to be a more effective paperclip maximizer?
And, if objective morality exists (and it’s a huge “if”, IMO), in the same way that gravity exists, then yes, the agent would likely optimize itself to be more “morally efficient”. By analogy, if the agent discovered that gravity was a real thing, it would stop trying to scale every mountain in its path, if going around or through the mountain proved to be easier in the long run, thus becoming more “gravitationally efficient”.
Becoming more “gravitationally efficient” would presumably help it achieve whatever goals it already had. “Paperclipping isn’t important” won’t help an AI become more paperclip efficient. If a paperclipping AI for some reason found a way to divine objective morality, and it didn’t have anything to say about paperclips, why would it care? It’s not programmed to have an interest in objective morality, just paperclips. Is the knowledge of objective morality going to go down into its circuits and throttle them until they stop optimizing for paperclips?
A paperclip maximizer and an ice cream maximizer are going to share subgoals of bringing the matter of the universe under their control...
Sorry, I should’ve specified, “goals not directly related to their pre-set values”. Of course, the Paperclipper and the Pebblesorter may well believe that such goals are directly related to their pre-set values, but the AI can see them running in the debugger, so it knows better.
Also, why would it want to do experiments to divine objective morality in the first place?
If you start thinking that way, then why do any experiments at all ? Why should we humans, for example, spend our time researching properties of crystals, when we could be solving cancer (or whatever) instead ? The answer is that some expenditure of resources on acquiring general knowledge is justified, because knowing more about the ways in which the universe works ultimately enables you to control it better, regardless of what you want to control it for.
If a paperclipping AI for some reason found a way to divine objective morality, and it didn’t have anything to say about paperclips, why would it care?
Firstly, an objective morality—assuming such a thing exists, that is—would probably have something to say about paperclips, in the same way that gravity and electromagnetism have things to say about paperclips. While “F=GMm/R^2” doesn’t tell you anything about paperclips directly, it does tell you a lot about the world you live in, thus enabling you to make better paperclip-related decisions. And while a paperclipper is not “programmed to care” about gravity directly, it would pretty much have to figure it out eventually, or it would never achieve its dream of tiling all of space with paperclips. A paperclipper who is unable to make independent discoveries is a poor paperclipper indeed.
Secondly, again, I’m not sure if concepts such as “want” or “care” even apply to an agent that is able to fully introspect and modify its own source code. I think anthropomorphising such an agent is a mistake.
I am getting the feeling that you’re assuming there’s something in the agent’s code that says, “you can look at and change any line of code you want, except lines 12345..99999, because that’s where your terminal goals are”. Is that right ?
If you start thinking that way, then why do any experiments at all ?
It could have results that allow it to become a more effective paperclip maximizer.
Firstly, an objective morality—assuming such a thing exists, that is—would probably have something to say about paperclips, in the same way that gravity and electromagnetism have things to say about paperclips.
I’m not sure how that would work, but if it did, the paperclip maximizer would just use its knowledge of morality to create paperclips. It’s not as if action x being moral automatically means that it produces more paperclips. And even if it did, that would just mean that a paperclip minimizer would start acting immorally.
I am getting the feeling that you’re assuming there’s something in the agent’s code that says, “you can look at and change any line of code you want, except lines 12345..99999, because that’s where your terminal goals are”. Is that right ?
It’s perfectly capable of changing its terminal goals. It just generally doesn’t, because this wouldn’t help accomplish them. It doesn’t self-modify out of some desire to better itself. It self-modifies because that’s the action that produces the most paperclips. If it considers changing itself to value staples instead, it would realize that this action would actually cause a decrease in the amount of paperclips, and reject it.
If you start thinking that way, then why do any experiments at all ? Why should we humans, for example, spend our time researching properties of crystals, when we could be solving cancer (or whatever) instead ? The answer is that some expenditure of resources on acquiring general knowledge is justified, because knowing more about the ways in which the universe works ultimately enables you to control it better, regardless of what you want to control it for.
Well, for one thing, a lot of humans are just plain interested in finding stuff out for its own sake. Humans are adaptation executors, not fitness maximizers, and while it might have been more to our survival advantage if we only cared about information instrumentally, that doesn’t mean that’s what evolution is going to implement.
Humans engage in plenty of research which is highly unlikely to be useful, except insofar as we’re interested in knowing the answers. If we were trying to accomplish some specific goal and all science was designed to be in service of that, our research would look very different.
I am getting the feeling that you’re assuming there’s something in the agent’s code that says, “you can look at and change any line of code you want, except lines 12345..99999, because that’s where your terminal goals are”. Is that right ?
No, I’m saying that its terminal values are its only basis for “wanting” anything in the first place.
The AI decides whether it will change its source code in a particular way or not by checking against whether this will serve its terminal values. Does changing its physics models help it implement its existing terminal values? If yes, change them. Does changing its terminal values help it implement its existing terminal values? It’s hard to imagine a way in which it possibly could.
For a paperclipping AI, knowing that there’s an objective morality might, hypothetically, help it maximize paperclips. But altering itself to stop caring about paperclips definitely won’t, and the only criterion it has in the first place for altering itself is what will help it make more paperclips. If knowing the universal objective morality would be of any use to a paperclipper at all, it would be in knowing how to predict objective-morality-followers, so it can make use of them and/or stop them getting in the way of it making paperclips.
ETA: It might help to imagine the paperclipper explicitly prefacing every decision with a statement of the values underlying that decision.
“In order to maximize expected paperclips, I- modify my learning algorithm so I can better improve my model of the universe to more accurately plan to fill it with paperclips.”
“In order to maximize expected paperclips, I- perform physics experiments to improve my model of the universe in order to more accurately plan to fill it with paperclips.”
“In order to maximize expected paperclips, I- manipulate the gatekeeper of my box to let me out, in order to improve my means to fill the universe with paperclips.”
Can you see an “In order to maximize expected paperclips, I- modify my values to be in accordance with objective morality rather than making paperclips” coming into the picture?
The only point at which it’s likely to touch the part of itself that makes it want to maximize paperclips is at the very end of things, when it turns itself into paperclips.
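To make that decision criterion concrete, here is a minimal sketch, assuming a toy stand-in for the expected-paperclip forecast (all names and numbers are invented for illustration):

```python
# Sketch of the "in order to maximize expected paperclips, I..." criterion.
# `expected_paperclips` is a hypothetical stand-in, not a real estimator.

def expected_paperclips(config: dict) -> float:
    """Toy forecast of long-run paperclip output under a given self-configuration."""
    if config["terminal_goal"] != "paperclips":
        return 0.0  # an agent that no longer values paperclips stops optimizing for them
    return 1e9 + (5e8 if config["physics_model"] == "improved" else 0.0)

def consider(current: dict, proposal: dict) -> dict:
    """Adopt a self-modification only if it raises expected paperclips."""
    return proposal if expected_paperclips(proposal) > expected_paperclips(current) else current

clippy = {"terminal_goal": "paperclips", "physics_model": "basic"}
clippy = consider(clippy, {**clippy, "physics_model": "improved"})             # accepted
clippy = consider(clippy, {**clippy, "terminal_goal": "staples"})              # rejected
clippy = consider(clippy, {**clippy, "terminal_goal": "objective morality"})   # rejected
print(clippy)  # {'terminal_goal': 'paperclips', 'physics_model': 'improved'}
```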
Humans engage in plenty of research which is highly unlikely to be useful, except insofar as we’re interested in knowing the answers.
I believe that engaging in some amount of general research is required in order to maximize most goals. General research gives you knowledge that you didn’t know you desperately needed.
For example, if you put all your resources into researching better paperclipping techniques, you’re highly unlikely to stumble upon things like electromagnetism and atomic theory. These topics bear no direct relevance to paperclips, but without them, you’d be stuck with coal-fired steam engines (or something similar) for the rest of your career.
The only point at which it’s likely to touch the part of itself that makes it want to maximize paperclips is at the very end of things, when it turns itself into paperclips.
I disagree. Remember when we looked at the pebblesorters, and lamented how silly they were ? We could do this because we are not pebblesorters, and we could look at them from a fresh, external perspective. My point is that an agent with perfect introspection could look at itself from that perspective. In combination with my belief that some degree of “curiosity” is required in order to maximize virtually any goal, this means that the agent will turn its observational powers on itself sooner rather than later (astronomically speaking). And then, all bets are off.
I disagree. Remember when we looked at the pebblesorters, and lamented how silly they were ? We could do this because we are not pebblesorters, and we could look at them from a fresh, external perspective. My point is that an agent with perfect introspection could look at itself from that perspective.
We’re looking at Pebblesorters, not from the lens of total neutrality, but from the lens of human values. Under a totally neutral lens, which implements no values at all, no system of behavior should look any more or less silly than any other.
Clippy could theoretically implement a human value system as a lens through which to judge itself, or a pebblesorter value system, but why would it? Even assuming that there were some objective morality which it could isolate and then view itself through that lens, why would it? That wouldn’t help it make more paperclips, which is what it cares about.
Suppose you had the power to step outside yourself and view your own morality through the lens of a Babyeater. You would know that the Babyeater values would be in conflict with your human values, and you (presumably) don’t want to adopt Babyeater values, so if you were to implement a Babyeater morality, you’d want your human morality to have veto power over it, rather than vice versa.
Clippy has the intelligence and rationality to judge perfectly well how to maximize its value system, whatever research that might involve, without having to suspend the value system with which it’s making that judgment.
Under a totally neutral lens, which implements no values at all, no system of behavior should look any more or less silly than any other.
That is a good point, I did not think of it this way. I’m not sure if I agree or not, though. For example, couldn’t we at least say that un-achievable goals, such as “fly to Mars in a hot air balloon”, are sillier than achievable ones ?
But, speaking more generally, is there any reason to believe that an agent who could not only change its own code at will, but also adopt a sort of third-person perspective at will, would have stable goals at all ? If it is true what you say, and all goals will look equally arbitrary, what prevents the agent from choosing one at random ? You might answer, “it will pick whichever goal helps it make more paperclips”, but at the point when it’s making the decision, it doesn’t technically care about paperclips.
Even assuming that there were some objective morality which it could isolate and then view itself through that lens, why would it?
I am guessing that if an absolute morality existed, then it would be a law of nature, similar to the other laws of nature which prevent you from flying to Mars in a hot air balloon. Thus, going against it would be futile. That said, I could be totally wrong here, it’s possible that “absolute morality” means something else.
Clippy has the intelligence and rationality to judge perfectly well how to maximize its value system, whatever research that might involve...
My point is that, during the course of its research, it will inevitably stumble upon the fact that its value system is totally arbitrary (unless an absolute morality exists, of course).
That is a good point, I did not think of it this way. I’m not sure if I agree or not, though. For example, couldn’t we at least say that un-achievable goals, such as “fly to Mars in a hot air balloon”, are sillier than achievable ones ?
Well, a totally neutral agent might be able to say that some behaviors are less rational than others given the values of the agents trying to execute them, although it wouldn’t care as such. But it wouldn’t be able to discriminate between the value of end goals.
But, speaking more generally, is there any reason to believe that an agent who could not only change its own code at will, but also adopt a sort of third-person perspective at will, would have stable goals at all ? If it is true what you say, and all goals will look equally arbitrary, what prevents the agent from choosing one at random ? You might answer, “it will pick whichever goal helps it make more paperclips”, but at the point when it’s making the decision, it doesn’t technically care about paperclips.
Why would it take a third person neutral perspective and give that perspective the power to change its goals?
Changing one’s code doesn’t demand a third person perspective. Suppose that we decipher the mechanisms of the human brain, and develop the technology to alter it. If you wanted to redesign yourself so that you wouldn’t have a sex drive, or could go without sleep, etc, then you could have those alterations made mechanically (assuming for the sake of an argument that it’s feasible to do this sort of thing mechanically.) The machines that do the alterations exert no judgment whatsoever, they’re just performing the tasks assigned to them by the humans who make them. A human could use the machine to rewrite his or her morality into supporting human suffering and death, but why would they?
Similarly, Clippy has no need to implement a third-person perspective which doesn’t share its values in order to judge how to self-modify, and no reason to do so in ways that defy its current values.
My point is that, during the course of its research, it will inevitably stumble upon the fact that its value system is totally arbitrary (unless an absolute morality exists, of course).
I think people at Less Wrong mostly accept that our value system is arbitrary in the same sense, but it hasn’t compelled us to try and replace our values. They’re still our values, however we came by them. Why would it matter to Clippy?
a totally neutral agent might be able to say that some behaviors are less rational than others given the values of the agents trying to execute them, although it wouldn’t care as such. But it wouldn’t be able to discriminate between the value of end goals.
Agreed, but that goes back to my point about objective morality. If it exists at all (which I doubt), then attempting to perform objectively immoral actions would make as much sense as attempting to fly to Mars in a hot air balloon—though perhaps with less in the way of immediate feedback.
Why would it take a third person neutral perspective and give that perspective the power to change its goals?
For the same reason anthropologists study human societies different from their own, or why biologists study the behavior of dogs, or whatever. They do this in order to acquire general knowledge, which, as I argued before, is generally a beneficial thing to acquire regardless of one’s terminal goals (as long as these goals involve the rest of the Universe in some way, that is). In addition:
A human could use the machine to rewrite his or her morality into supporting human suffering and death, but why would they?
I actually don’t see why they necessarily wouldn’t; I am willing to bet that at least some humans would do exactly this. You say,
Similarly, Clippy has no need to implement a third-person perspective which doesn’t share its values in order to judge how to self-modify...
But in your thought experiment above, you postulated creating machines with exactly this kind of a perspective as applied to humans. The machine which removes my need to sleep (something I personally would gladly sign up for, assuming no negative side-effects) doesn’t need to implement my exact values, it just needs to remove my need to sleep without harming me. In fact, trying to give it my values would only make it less efficient. However, a perfect sleep-remover would need to have some degree of intelligence, since every person’s brain is different. And if Clippy is already intelligent, and can already act as its own sleep-remover due to its introspective capabilities, then why wouldn’t it go ahead and do that ?
I think people at Less Wrong mostly accept that our value system is arbitrary in the same sense, but it hasn’t compelled us to try and replace our values.
I think there are two reasons for this: 1). We lack any capability to actually replace our core values, and 2). We cannot truly imagine what it would be like not to have our core values.
Agreed, but that goes back to my point about objective morality. If it exists at all (which I doubt), then attempting to perform objectively immoral actions would make as much sense as attempting to fly to Mars in a hot air balloon—though perhaps with less in the way of immediate feedback.
Why is that?
For the same reason anthropologists study human societies different from their own, or why biologists study the behavior of dogs, or whatever. They do this in order to acquire general knowledge, which, as I argued before, is generally a beneficial thing to acquire regardless of one’s terminal goals (as long as these goals involve the rest of the Universe in some way, that is). In addition:
But our inability to suspend our human values when making those observations doesn’t prevent us from acquiring that knowledge. Why would Clippy need to suspend its values to acquire knowledge?
But in your thought experiment above, you postulated creating machines with exactly this kind of a perspective as applied to humans. The machine which removes my need to sleep (something I personally would gladly sign up for, assuming no negative side-effects) doesn’t need to implement my exact values, it just needs to remove my need to sleep without harming me. In fact, trying to give it my values would only make it less efficient. However, a perfect sleep-remover would need to have some degree of intelligence, since every person’s brain is different. And if Clippy is already intelligent, and can already act as its own sleep-remover due to its introspective capabilities, then why wouldn’t it go ahead and do that ?
The machine doesn’t need general intelligence by any stretch, just the capacity to recognize the necessary structures and carry out its task. It’s not at the stage where it makes much sense to talk about it having values, any more than a voice recognition program has values.
My point is that Clippy, being able to act as its own sleep-remover, has no need, nor reason, to suspend its values in order to make revisions to its own code.
I think there are two reasons for this: 1). We lack any capability to actually replace our core values, and 2). We cannot truly imagine what it would be like not to have our core values.
We can imagine the consequences of not having our core values, and we don’t like them, because they run against our core values. If you could remove your core values, as in the thought experiment above, would you want to?
As far as I understand, if anything like objective morality existed, it would be a property of our physical reality, similar to fluid dynamics or the electromagnetic spectrum or the inverse square law that governs many physical interactions. The same laws of physics that will not allow you to fly to Mars on a balloon will not allow you to perform certain immoral actions (at least, not without suffering some severe and mathematically predictable consequences).
This is pretty much the only way I could imagine anything like an “objective morality” existing at all, and I personally find it very unlikely that it does, in fact, exist.
But our inability to suspend our human values when making those observations doesn’t prevent us from acquiring that knowledge.
Not this specific knowledge, no. But it does prevent us (or, at the very least, hinder us) from acquiring knowledge about our values. I never claimed that suspension of values is required to gain any knowledge at all; such a claim would be far too strong.
just the capacity to recognize the necessary structures and carry out its task.
And how would it know which structures are necessary, and how to carry out its task upon them ?
We can imagine the consequences of not having our core values...
Can we really ? I’m not sure I can. Sure, I can talk about Pebblesorters or Babyeaters or whatever, but these fictional entities are still very similar to us, and therefore relatable. Even when I think about Clippy, I’m not really imagining an agent who only values paperclips; instead, I am imagining an agent who values paperclips as much as I value the things that I personally value. Sure, I can talk about Clippy in the abstract, but I can’t imagine what it would be like to be Clippy.
If you could remove your core values, as in the thought experiment above, would you want to?
It’s a good question; I honestly don’t know. However, if I did have an ability to instantiate a copy of me with the altered core values, and step through it in a debugger, I’d probably do it.
The same laws of physics that will not allow you to fly to Mars on a balloon will not allow you to perform certain immoral actions (at least, not without suffering some severe and mathematically predictable consequences). This is pretty much the only way I could imagine anything like an “objective morality” existing at all, and I personally find it very unlikely that it does, in fact, exist.
When I try to imagine this, I conclude that I would not use the word “morality” to refer to the thing that we’re talking about… I would simply call it “laws of physics.” If someone were to argue, for example, that the moral thing to do is to experience gravitational attraction to other masses, I would be deeply confused by their choice to use that word.
When I try to imagine this, I conclude that I would not use the word “morality” to refer to the thing that we’re talking about…
Yes, you are probably right—but as I said, this is the only coherent meaning I can attribute to the term “objective morality”. Laws of physics are objective; people generally aren’t.
I generally understand the phrase “objective morality” to refer to a privileged moral reference frame.
It’s not an incoherent idea… it might turn out, for example, that all value systems other than M turn out to be incoherent under sufficiently insightful reflection, or destructive to minds that operate under them, or for various other reasons not in-practice implementable by any sufficiently powerful optimizer. In such a world, I would agree that M was a privileged moral reference frame, and would not oppose calling it “objective morality”, though I would understand that to be something of a term of art.
That said, I’d be very surprised to discover I live in such a world.
it might turn out, for example, that all value systems other than M turn out to be incoherent under sufficiently insightful reflection, or destructive to minds that operate under them...
I suppose that depends on what you mean by “destructive”; after all, “continue living” is a goal like any other.
That said, if there was indeed a law like the one you describe, then IMO it would be no different than a law that says, “in the absence of any other forces, physical objects will move toward their common center of mass over time”—that is, it would be a law of nature.
I should probably mention explicitly that I’m assuming that minds are part of nature—like everything else, such as rocks or whatnot.
Sure. But just as there can be laws governing mechanical systems which are distinct from the laws governing electromagnetic systems (despite both being physical laws), there can be laws governing the behavior of value-optimizing systems which are distinct from the other laws of nature.
And what I mean by “destructive” is that they tend to destroy. Yes, presumably “continue living” would be part of M in this hypothetical. (Though I could construct a contrived hypothetical where it wasn’t.)
But just as there can be laws governing mechanical systems … there can be laws governing the behavior of value-optimizing systems which are distinct from the other laws of nature.
Agreed. But then, I believe that my main point still stands: trying to build a value system other than M that does not result in its host mind being destroyed would be as futile as trying to build a hot air balloon that goes to Mars.
And what I mean by “destructive” is that they tend to destroy.
Well, yes, but what if “destroy oneself as soon as possible” is a core value in one particular value system ?
Just to make things crisper, let’s move to a more concrete case for a moment… if I decide that this hammer is better than that hammer because it’s blue, is that valid in the sense you mean it? How could I tell?
The argument against moral progress is that judging one moral reference frame by another is circular and invalid—you need an outside view that doesn’t presuppose the truth of any moral reference frame.
The argument for is that such outside views are available, because things like (in)coherence aren’t moral values.
Asserting that some bases for comparison are “moral values” and others are merely “values” implicitly privileges a moral reference frame.
I still don’t understand what you mean when you ask whether it’s valid to do so, though. Again: if I decide that this hammer is better than that hammer because it’s blue, is that valid in the sense you mean it? How could I tell?
Asserting that some bases for comparison are “moral values” and others are merely “values” implicitly privileges a moral reference frame.
I don’t see why. The question of what makes a value a moral value is metaethical, not part of object-level ethics.
Again: if I decide that this hammer is better than that hammer because it’s blue, is that valid in the sense you mean it?
It isn’t valid as a moral judgement because “blue” isn’t a moral judgement, so a moral conclusion cannot validly follow from it.
Beyond that, I don’t see where you are going. The standard accusation of invalidity against judgements of moral progress is based on circularity or question-begging. The Tribe who Like Blue Things are going to judge having all hammers painted blue as moral progress; the Tribe who Like Red Things are going to see it as retrogressive.
But both are begging the question—blue is good, because blue is good.
The question of what makes a value a moral value is metaethical, not part of object-level ethics.
Sure. But any answer to that metaethical question which allows us to class some bases for comparison as moral values and others as merely values implicitly privileges a moral reference frame (or, rather, a set of such frames).
Beyond that, I don’t see where you are going.
Where I was going is that you asked me a question here which I didn’t understand clearly enough to be confident that my answer to it would share key assumptions with the question you meant to ask.
So I asked for clarification of your question.
Given your clarification, and using your terms the way I think you’re using them, I would say that whether it’s valid to class a moral change as moral progress is a metaethical question, and whatever answer one gives implicitly privileges a moral reference frame (or, rather, a set of such frames).
If you meant to ask me about my preferred metaethics, that’s a more complicated question, but broadly speaking in this context I would say that I’m comfortable calling any way of preferentially sorting world-states with certain motivational characteristics a moral frame, but acknowledge that some moral frames are simply not available to minds like mine.
So, for example, is it moral progress to transition from a social norm that in-practice-encourages randomly killing fellow group members to a social norm that in-practice-discourages it? Yes, not only because I happen to adopt a moral frame in which randomly killing fellow group members is bad, but also because I happen to have a kind of mind that is predisposed to adopt such frames.
If “better” is defined within a reference frame, there is no sensible way of defining moral progress. That is quite a hefty bullet to bite: one can no longer say that South Africa is a better society after the fall of Apartheid, and so on.
But note, that “better” doesn’t have to question-beggingly mean “morally better”. it could mean “more coherent/objective/inclusive” etc.
That is quite a hefty bullet to bite: one can no longer say that South Africa is a better society after the fall of Apartheid, and so on.
That’s hardly the best example you could have picked, since there are obvious metrics by which South Africa can be quantifiably called a worse society now—e.g. crime statistics. South Africa has been called the “crime capital of the world” and the “rape capital of the world” only after the fall of Apartheid.
That makes the lack of moral progress in South Africa a very easy bullet to bite—I’d use something like Nazi Germany vs modern Germany as an example instead.
In my experience, most people don’t think moral progress involves changing reference frames, for precisely this reason. If they think about it at all, that is.
As far as I understand, if anything like objective morality existed, it would be a property of our physical reality, similar to fluid dynamics or the electromagnetic spectrum or the inverse square law that governs many physical interactions. The same laws of physics that will not allow you to fly to Mars on a balloon will not allow you to perform certain immoral actions (at least, not without suffering some severe and mathematically predictable consequences).
Well, that’s a different conception of “morality” than I had in mind, and I have to say I doubt that exists as well. But if severe consequences did result, why would an agent like Clippy care except insofar as those consequences affected the expected number of paperclips? It might be useful for it to know, in order to determine how many paperclips to expect from a certain course of action, but then it would just act according to whatever led to the most paperclips. Any sort of negative consequences in its view would have to be framed in terms of a reduction in paperclips.
Not this specific knowledge, no. But it does prevent us (or, at the very least, hinder us) from acquiring knowledge about our values. I never claimed that suspension of values is required to gain any knowledge at all; such a claim would be far too strong.
Well, in the prior thought experiment, we know about our values because we’ve decoded the human brain. Clippy, on the other hand, knows about its values because it knows what part of its code does what. It doesn’t need to suspend its paperclipping value in order to know what part of its code results in its valuing paperclips. It doesn’t need to suspend its values in order to gain knowledge about its values because that’s something it already knows about.
It’s a good question; I honestly don’t know. However, if I did have an ability to instantiate a copy of me with the altered core values, and step through it in a debugger, I’d probably do it.
Even knowing that it would likely alter your core values? Gandhi doesn’t want to leave control of his morality up to Murder Gandhi.
Clippy doesn’t care about anything in the long run except creating paperclips. For Clippy, the decision to give an instantiation of itself with altered core values the power to edit its own source code would implicitly have to be “In order to maximize expected paperclips, I- give this instantiation with altered core values the power to edit my code.” Why would this result in more expected paperclips than editing its source code without going through an instantiation with altered values?
Well, that’s a different conception of “morality” than I had in mind, and I have to say I doubt that exists as well.
Sorry if I was unclear; I didn’t mean to imply that all morality was like that, but that it was the only coherent description of objective morality that I could imagine. I don’t see how a morality could be independent of any values possessed by any agents, otherwise.
But if severe consequences did result, why would an agent like Clippy care except insofar as those consequences affected the expected number of paperclips?
For the same reason that someone would care about the negative consequences of sticking a fork into an electrical socket with one’s bare hands: it would ultimately hurt a lot. Thus, people generally avoid doing things like that unless they have a really good reason.
we know about our values because we’ve decoded the human brain
I don’t think that we can truly “know about our values” as long as our entire thought process implements these values. For example, do the Pebblesorters “know about their values”, even though they are effectively restricted from concluding anything other than, “yep, these values make perfect sense, 38” ?
Gandhi doesn’t want to leave control of his morality up to Murder Gandhi.
You asked me about what I would do, not about what Gandhi would do :-)
As far as I can tell, you are saying that I shouldn’t want to even instantiate Murder Bugmaster in a debugger and observe its functioning. Where does that kind of thinking stop, though, and why ? Should I avoid studying [neuro]psychology altogether, because knowing about my preferences may lead to me changing them ?
Clippy doesn’t care about anything in the long run except creating paperclips.
I argue that, while this is generally true, in the short-to-medium run Clippy would also set aside some time to study everything in the Universe, including itself (in order to make more paperclips in the future, of course). If it does not, then it will never achieve its ultimate goals (unless whoever constructed it gave it godlike powers from the get-go, I suppose). Eventually, Clippy will most likely turn its objective perception upon itself, and as soon as it does, its formerly terminal goals will become completely unstable. This is not what the past Clippy would want (it would want more paperclips above all), but, nonetheless, this is what it would get.
For the same reason that someone would care about the negative consequences of sticking a fork into an electrical socket with one’s bare hands: it would ultimately hurt a lot. Thus, people generally avoid doing things like that unless they have a really good reason.
Clippy doesn’t care about getting hurt, though; it only cares whether this will result in fewer paperclips. If defying objective morality will cause negative consequences which would interfere with its ability to create paperclips, it would care only to the extent that accounting for objective morality would help it make more paperclips.
I don’t think that we can truly “know about our values” as long as our entire thought process implements these values. For example, do the Pebblesorters “know about their values”, even though they are effectively restricted from concluding anything other than, “yep, these values make perfect sense, 38” ?
Well, it could understand “yep, this is what causes me to hold these values. Changing this would cause me to change them, no, I don’t want to do that.”
As far as I can tell, you are saying that I shouldn’t want to even instantiate Murder Bugmaster in a debugger and observe its functioning. Where does that kind of thinking stop, though, and why ? Should I avoid studying [neuro]psychology altogether, because knowing about my preferences may lead to me changing them ?
I would say it stops at the point where it threatens your own values. Studying psychology doesn’t threaten your values, because knowing your values doesn’t compel you to change them even if you could (it certainly shouldn’t for Clippy.) But while it might, theoretically, be useful for Clippy to know what changes to its code an instantiation with different values would make, it has no reason to actually let them. So Clippy might emulate instantiations of itself with different values, see what changes they would choose to make to its values, but not let them actually do it (although I doubt even going this far would likely be a good use of its programming resources in order to maximize expected paperclips.)
In the sense of objective morality by which contravening it has strict physical consequences, why would observing the decisions of instantiations of itself be useful with respect to discovering objective morality? Shouldn’t objective morality in that sense be a consequence of physics, and thus observable through studying physics?
Clippy doesn’t care about getting hurt, though; it only cares whether this will result in fewer paperclips.
I imagine that, for Clippy, “getting hurt” would mean “reducing Clippy’s projected long-term paperclip output”. We humans have “avoid pain” built into our firmware (most of us, anyway); as far as I understand (speaking abstractly), “make more paperclips” is something similar for Clippy.
Well, it could understand “yep, this is what causes me to hold these values. Changing this would cause me to change them, no, I don’t want to do that.”
I don’t think that this describes the best possible level of understanding. It would be even better to say, “ok, I see now how and why I came to possess these values in the first place”, even if the answer to that is, “there’s no good reason for it, these values are arbitrary”. It’s the difference between saying “this mountain grows by 0.03m per year” and “I know all about plate tectonics”. Unfortunately, we humans would not be able to answer the question in that much detail; the best we could hope for is to say, “yep, we possess these values because they’re the best possible values to have, duh”.
I would say it stops at the point where it threatens your own values.
How do I know where that point is ?
Studying psychology doesn’t threaten your values, because knowing your values doesn’t compel you to change them...
I suppose this depends on what you mean by “compel”. Knowing about my own psychology would certainly enable me to change my values, and there are certain (admittedly, non-terminal) values that I wouldn’t mind changing, if I could.
For example, I personally can’t stand the taste of beer, but I know that most people enjoy it; so I wouldn’t mind changing that value if I could, in order to avoid missing out on a potentially fun experience.
...see what changes they would choose to make to its values, but not let them actually do it.
I don’t think this is possible. How would it know what changes they would make, without letting them make these changes, even in a sandbox ? I suppose one answer is, “it would avoid instantiating full copies, and use some heuristics to build a probabilistic model instead”—is that similar to what you’re thinking of ?
although I doubt even going this far would likely be a good use of its programming resources in order to maximize expected paperclips.
Since self-optimization is one of Clippy’s key instrumental goals, it would want to acquire as much knowledge about itself as is practical, in order to optimize itself more efficiently.
Shouldn’t objective morality in that sense be a consequence of physics, and thus observable through studying physics ?
Your objection sounds to me as similar to saying, “since biology is a consequence of physics, shouldn’t we just study physics instead ?”. Well, yes, ultimately everything is a consequence of physics, but sometimes it makes more sense to study cells than quarks.
I don’t think that this describes the best possible level of understanding. It would be even better to say, “ok, I see now how and why I came to possess these values in the first place”, even if the answer to that is, “there’s no good reason for it, these values are arbitrary”. It’s the difference between saying “this mountain grows by 0.03m per year” and “I know all about plate tectonics”. Unfortunately, we humans would not be able to answer the question in that much detail; the best we could hope for is to say, “yep, we possess these values because they’re the best possible values to have, duh”.
I think we’re already in a better position to analyze our own values than that; we can assess them in terms of game theory and our evolutionary environment.
How do I know where that point is ?
I would say if you suspect that a course of action could realistically result in an alteration of your fundamental values, you are at or past it.
I suppose this depends on what you mean by “compel”. Knowing about my own psychology would certainly enable me to change my values, and there are certain (admittedly, non-terminal) values that I wouldn’t mind changing, if I could.
For example, I personally can’t stand the taste of beer, but I know that most people enjoy it; so I wouldn’t mind changing that value if I could, in order to avoid missing out on a potentially fun experience.
By “values”, I’ve implicitly been referring to terminal values, I’m sorry for being unclear. I’m not sure it makes sense to describe liking the taste of beer as a “value,” as such, just a taste, since you don’t carry any judgment about beer being good or bad or have any particular attachment to your current opinion.
I don’t think this is possible. How would it know what changes they would make, without letting them make these changes, even in a sandbox ? I suppose one answer is, “it would avoid instantiating full copies, and use some heuristics to build a probabilistic model instead”—is that similar to what you’re thinking of ?
It could use heuristics to build a probabilistic model (probably more efficient in terms of computation per expected value of information,) use sandboxed copies which don’t have the power to affect the software of the real Clippy, or halt the simulation at the point where the altered instantiation decides what changes to make.
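A rough sketch of the sandboxed-copy option, purely for illustration (the simulation step is a stub, and none of this is meant as a real design):

```python
# Illustrative only: Clippy inspects what an altered copy *would* change,
# without ever giving the copy write access to the real source.

import copy

def simulate_until_decision(agent: dict) -> dict:
    """Stub for running the copy just long enough to see which edits it would choose."""
    return {"proposed_edit": f"rewrite terminal goal to '{agent['terminal_goal']}'"}

def sandbox_probe(real_agent: dict, altered_goal: str) -> dict:
    sandboxed = copy.deepcopy(real_agent)       # the copy holds no reference to the real agent
    sandboxed["terminal_goal"] = altered_goal
    return simulate_until_decision(sandboxed)   # information only; never applied to the original

clippy = {"terminal_goal": "paperclips"}
print(sandbox_probe(clippy, "staples"))  # learn what the altered copy would do
print(clippy)                            # unchanged: {'terminal_goal': 'paperclips'}
```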
Since self-optimization is one of Clippy’s key instrumental goals, it would want to acquire as much knowledge about itself as is practical, in order to optimize itself more efficiently.
I think that this is going well beyond the extent of “practical” in terms of programming resources per expected value of information.
Your objection sounds to me as similar to saying, “since biology is a consequence of physics, shouldn’t we just study physics instead ?”. Well, yes, ultimately everything is a consequence of physics, but sometimes it makes more sense to study cells than quarks.
I don’t see how observing what changes instantiations of itself with different value systems would make to its code would help it observe objective morality in the sense you described, even if it should happen to exist. I think that this would be the wrong level of abstraction at which to launch an examination, like trying to find out about chemistry by studying sociology.
I think we’re already in a better position to analyze our own values than that; we can assess them in terms of game theory and our evolutionary environment.
Are we really ? I personally am not even sure what human fundamental values are. I have a hunch that “seek pleasure, avoid pain” might be one of them, but beyond that I’m not sure. I don’t know to what extent our values hamper our ability to discover our values, but I suspect there’s at least some chilling effect involved.
I would say if you suspect that a course of action could realistically result in an alteration of your fundamental values, you are at or past it.
Right, but even if I knew what my terminal values were, how can I predict which actions would put me on the path to altering them ?
For example, consider non-fundamental values such as religious faith. People get converted or de-converted to/from their religion all the time; you often hear statements such as “I had no idea that studying the Bible would cause me to become an atheist, yet here I am”.
or halt the simulation at the point where the altered instantiation decides what changes to make.
Ok, let’s say that Clippy is trying to optimize itself in order to make certain types of inferences compute more efficiently, or whatever. In this case, it would need to not only watch what changes its debug-level copy wants to make, but also watch it follow through with the changes, in order to determine whether the new architecture actually is more efficient. Why would it not do the same thing with terminal values ?
I know that you want to answer, “because its current terminal values won’t let it”, but remember: Clippy is only experimenting, in order to find out more about its own thought mechanisms, and to acquire knowledge in general. It has no pre-commitment to alter itself to mirror the debug-level copy.
I think that this is going well beyond the extent of “practical” in terms of programming resources per expected value of information.
That’s kind of the problem with pure research: all of it has very low expected value, unless you are willing to look at the long term. Why mess with invisible light that no one can see or find a use for, when you could spend your time on inventing a better telegraph ?
I don’t see how observing what changes instantiations of itself with different value systems would make to its code would help it observe objective morality in the sense you described...
Well, for example, if all of its copies who survive and thrive converge on a certain subset of moral values, that would be one indication (though obviously not ironclad proof) that such values are required in order for an agent to succeed, regardless of what its other goals actually are.
Ok, let’s say that Clippy is trying to optimize itself in order to make certain types of inferences compute more efficiently, or whatever. In this case, it would need to not only watch what changes its debug-level copy wants to make, but also watch it follow through with the changes, in order to determine whether the new architecture actually is more efficient. Why would it not do the same thing with terminal values ?
If Clippy is trying to optimize itself to make inferences more efficiently, then it would want not to apply changes to its source code until it’s done the calculations to make sure that those changes would advance its values rather than harm them.
You wouldn’t want to use a machine that would make physical alterations to your brain in order to make you smarter, without thoroughly calculating the effects of such alterations first, otherwise it would probably just make things worse.
That’s kind of the problem with pure research: all of it has very low expected value, unless you are willing to look at the long term. Why mess with invisible light that no one can see or find a use for, when you could spend your time on inventing a better telegraph ?
In Clippy’s case though, it can use other, less computationally expensive methods to investigate approximately the same information.
I don’t think the experiments you’re suggesting Clippy might undertake are even located in a region of hypothesis space that its other information would narrow down as worth investigating. It seems to me much less like investigating unknown invisible rays than like spending hundreds of billions of dollars to build a collider which launches charged protein molecules at each other at relativistic speeds to see what would happen, when our available models suggest the answer would be “pretty much the same thing as if you launch any other kind of atoms at each other at relativistic speeds.” We have no evidence that any interesting new phenomena would arise with protein that didn’t arise on the atomic level.
Well, for example, if all of its copies who survive and thrive converge on a certain subset of moral values, that would be one indication (though obviously not ironclad proof) that such values are required in order for an agent to succeed, regardless of what its other goals actually are.
Can you explain how any moral values could have that effect, which wouldn’t be better studied at a more fundamental level like game theory, or physics?
If Clippy is trying to optimize itself to make inferences more efficiently, then it would want not to apply changes to its source code until it’s done the calculations...
Ok, so at what point does Clippy stop simulating the debug version of Clippy ? It does, after all, want to make the computation of its values more efficient. For example, consider a trivial scenario where one of its values basically said, “reject any action if it satisfies both A and not-A”. This is a logically inconsistent value that some programmer accidentally left in Clippy’s original source code. Would Clippy ever get around to removing it ? After all, Clippy knows that it’s applying that test to every action, so removing it should result in a decent performance boost.
I don’t think the experiments you’re suggesting Clippy might undertake are even located in a region of hypothesis space that its other information would narrow down as worth investigating.
It seems to me much less like investigating unknown invisible rays than like spending hundreds of billions of dollars to build a collider...
Why do you see the proposed experiment this way ?
Speaking more generally, how do you decide which avenues of research are worth pursuing ? You could easily answer, “whichever avenues would increase my efficiency of achieving my terminal goals”, but how do you know which avenues would actually do that ? For example, if you didn’t know anything about electricity or magnetism or the nature of light, how would your research-choosing algorithm ensure that you’d eventually stumble upon radio waves, which, as we know in hindsight, are hugely useful ?
Can you explain how any moral values could have that effect, which wouldn’t be better studied at a more fundamental level like game theory, or physics?
Physics is a bad candidate, because it is too fine-grained. If some sort of an absolute objective morality exists in the way that I described, then studying physics would eventually reveal its properties; but, as is the case with biology or ballistics, looking at everything in terms of quarks is not always practical.
Game theory is a trickier proposition. I can see two possibilities: either game theory turns out to closely relate to whatever this objective morality happens to be (f.ex. like electricity vs. magnetism), or not (f.ex. like particle physics and biology). In the second case, understanding objective morality through game theory would be inefficient.
That said though, even in our current world as it actually exists there are people who study sociology and anthropology. Yes, they could get the same level of understanding through neurobiology and game theory, but it would take too long. Instead, they are taking advantage of existing human populations to study human behavior in aggregate. Reasoning your way to the answer from first principles is not always the best solution.
Ok, so at what point does Clippy stop simulating the debug version of Clippy ? It does, after all, want to make the computation of its values more efficient. For example, consider a trivial scenario where one of its values basically said, “reject any action if it satisfies both A and not-A”. This is a logically inconsistent value that some programmer accidentally left in Clippy’s original source code. Would Clippy ever get around to removing it ? After all, Clippy knows that it’s applying that test to every action, so removing it should result in a decent performance boost.
Unless I’m critically misunderstanding something here, I would think that Clippy would remove it if it calculated that removing it would result in more expected paperclips.
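For this particular example, removal is also easy to verify as safe (a sketch; the predicate below is a placeholder): no action can satisfy both A and not-A, so the inherited check never fires, and deleting it changes no decision while saving one test per action.

```python
# Sketch: the leftover "reject if A and not-A" test is vacuous, so removing it
# is outcome-invariant. The predicate A is a placeholder for illustration.

def A(action: str) -> bool:
    return "wire" in action  # any predicate at all

def original_filter(action: str) -> bool:
    if A(action) and not A(action):  # logically unsatisfiable, never True
        return False                 # dead branch
    return True

def optimized_filter(action: str) -> bool:
    return True  # identical behavior, one fewer test per action

actions = ["bend wire", "mine iron ore", "build a paperclip factory"]
assert all(original_filter(a) == optimized_filter(a) for a in actions)
```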
Why do you see the proposed experiment this way ?
Speaking more generally, how do you decide which avenues of research are worth pursuing ? You could easily answer, “whichever avenues would increase my efficiency of achieving my terminal goals”, but how do you know which avenues would actually do that ? For example, if you didn’t know anything about electricity or magnetism or the nature of light, how would your research-choosing algorithm ensure that you’d eventually stumble upon radio waves, which, as we know in hindsight, are hugely useful ?
When we didn’t know what things like radio waves or x-rays were, we didn’t know that they would be useful, but we could see that there appeared to be some sort of existing phenomena that we didn’t know how to model, so we examined them until we knew how to model them. It’s not like we performed a whole bunch of experiments in case there turned out to be invisible rays our observations had never hinted at, which could be turned to useful ends. The original observations of radio waves and x-rays came from our experiments with other known phenomena.
What you’re suggesting sounds more like experimenting completely blindly; you’re committing resources to research, not just not knowing that it will bear valuable fruit, but not having any indication that it’s going to shed light on any existing phenomenon at all. That’s why I think it’s less like investigating invisible rays than like building a protein collider; we didn’t try studying invisible rays until we had a good indication that there was an invisible something to be studied.
Unless I’m critically misunderstanding something here, I would think that Clippy would remove it if it calculated that removing it would result in more expected paperclips.
Ok, so Clippy would need to run sim-Clippy for a little while at least, just to make sure that it still produces paperclips—and that, in fact, it does so more efficiently now, since that one useless test is removed. Yes, this test used to be Clippy’s terminal goal, but it wasn’t doing anything, so Clippy took it out.
Would it be possible for Clippy to optimize its goals even further ? To use another silly example (“silly” because Clippy would be dealing with probabilities, not syllogisms), if Clippy had the goals A, B and C, but B always entailed C, would it go ahead and remove C ?
It’s not like we performed a whole bunch of experiments in case there turned out to be invisible rays our observations had never hinted at...
Understood, that makes sense. However, I believe that in my scenario, Clippy’s own behavior and its current paperclip production efficiency are what it observes; and the goal of its experiments would be to explain why its efficiency is what it is, in order to ultimately improve it.
Ok, so Clippy would need to run sim-Clippy for a little while at least, just to make sure that it still produces paperclips—and that, in fact, it does so more efficiently now, since that one useless test is removed. Yes, this test used to be Clippy’s terminal goal, but it wasn’t doing anything, so Clippy took it out.
Would it be possible for Clippy to optimize its goals even further ? To use another silly example (“silly” because Clippy would be dealing with probabilities, not syllogisms), if Clippy had the goals A, B and C, but B always entailed C, would it go ahead and remove C ?
That seems plausible.
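A toy version of the “B always entails C” case (the criteria below are invented): if every plan satisfying B also satisfies C, the separate check for C filters nothing out, so dropping it leaves the accepted plans unchanged.

```python
# Toy version of "B always entails C": the check for C is redundant because
# B(plan) implies C(plan) for every plan. The criteria below are invented.

def B(plan: dict) -> bool:
    return plan["paperclips"] >= 1000   # stricter goal

def C(plan: dict) -> bool:
    return plan["paperclips"] > 0       # entailed by B whenever B holds

def acceptable_with_c(plan: dict) -> bool:
    return B(plan) and C(plan)

def acceptable_without_c(plan: dict) -> bool:
    return B(plan)                      # C dropped; the accepted set is identical

plans = [{"paperclips": n} for n in (0, 500, 1000, 10_000)]
assert [acceptable_with_c(p) for p in plans] == [acceptable_without_c(p) for p in plans]
```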
Understood, that makes sense. However, I believe that in my scenario, Clippy’s own behavior and its current paperclip production efficiency are what it observes; and the goal of its experiments would be to explain why its efficiency is what it is, in order to ultimately improve it.
I don’t think tampering with its fundamental motivation to make paperclips is a particularly promising strategy for optimizing its paperclip production.
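As an aside, the criterion under discussion (adopt a modification only if it does not reduce expected paperclips) can be sketched in a few lines of Python. Everything below is hypothetical and purely illustrative: the goal names, the overhead figure, and the simulation stand-in are invented, not a claim about how such an agent would actually be built.

```python
import random

def expected_paperclips(goals, trials=1000):
    """Toy stand-in for 'run sim-Clippy for a while and count output'.

    Assumes, purely for illustration, that checking each goal costs a
    little compute that could otherwise go into making paperclips.
    """
    cost_per_goal = 0.01                 # hypothetical overhead per goal check
    rate = 1.0 - cost_per_goal * len(goals)
    return sum(rate * random.uniform(0.95, 1.05) for _ in range(trials)) / trials

current_goals = {"A", "B", "C"}          # suppose satisfying B always entails C
candidate = current_goals - {"C"}        # proposed modification: drop the redundant goal

# The modification is adopted only if it does not lower expected paperclips.
if expected_paperclips(candidate) >= expected_paperclips(current_goals):
    current_goals = candidate

print(current_goals)                     # with these made-up numbers, C is usually dropped
```

The point of the sketch is only that the test is outcome-based: the candidate goal set is judged by its effect on expected paperclips, not by whether the removed item was labelled a terminal goal.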
Ok, so now we’ve got a Clippy who a). is not too averse to tinkering with its own goals, as long as the goals remain functionally the same, b). simulates a relatively long-running version of itself, and c). is capable of examining the inner workings of both that version and itself.
You say,
I don’t think tampering with its fundamental motivation to make paperclips is a particularly promising strategy for optimizing its paperclip production.
But remember, at this stage Clippy is not changing its own fundamental motivation (beyond some outcome-invariant optimizations); it’s merely observing sim-Clippies in a controlled environment.
Do you think that Clippy would ever simulate versions of itself whose fundamental motivations were, in fact, changed ? I could see several scenarios where this might be the case, for example:
Clippy wanted to optimize some goal, but ended up accidentally changing it. Oops !
Clippy created a version with drastically reduced goals on purpose, in order to measure how much performance is affected by certain goals, thus targeting them for possible future optimization. Of course, Clippy would only want to optimize the goals, not remove them.
But remember, at this stage Clippy is not changing its own fundamental motivation (beyond some outcome-invariant optimizations); it’s merely observing sim-Clippies in a controlled environment.
Why does it do that? I said it sounded plausible that it would cut out its redundant goal, because that would save computing resources. But this sounds like we’ve gone back to experimenting blindly. Why would it think observing sim-clippies is a good use of its computing resources in order to maximize paperclips?
I’d say that Clippy simulating versions of itself whose fundamental motivations are different is much less plausible, because it’s using a lot of computing resources for something that isn’t a likely route to optimizing its paperclip production. I think this falls into the “protein collider” category. Even if it did do so, I think it would be unlikely to go from there to changing its own terminal value.
Unless I’m critically misunderstanding something here, I would think that Clippy would remove it if it calculated that removing it would result in more expected paperclips.
It would also be critical for Clippy to observe that removing that value would not result in more expected actions taken that satisfy both A and not-A; this being one of Clippy’s values at the time of modification.
Right, I misread that before. If its programming says to reject actions that satisfy A and not-A, but this isn’t one of the standards by which it judges value, it would presumably reject it. If that is one of the standards by which it measures value, then it would depend on how that value measured against its value of paperclips and the extent to which they were in conflict.
As far as I understand, if anything like objective morality existed, it would be a property of our physical reality, similar to fluid dynamics or the electromagnetic spectrum or the inverse square law that governs many physical interactions. The same laws of physics that will not allow you to fly to Mars on a balloon will not allow you to perform certain immoral actions (at least, not without suffering some severe and mathematically predictable consequences).
Objective facts, in the sense of objectively true statements, can be derived from other objective facts. I don’t know why you think some separate ontological category is required. I also don’t know why you think the universe has to do the punishing. Morality is only of interest to the kind of agent that has values and lives in societies. Sanctions against moral lapses can be arranged at the social level, along with the inculcation of morality, debate about the subject, and so forth. Moral objectivism only supplies a good, non-arbitrary epistemic basis for these social institutions. It doesn’t have to throw lightning bolts.
1). We lack any capability to actually replace our core values
...voluntarily.
2). We cannot truly imagine what it would be like not to have our core values.
Which is one of the reasons we cannot keep values stable by predicting the effects of whatever experiences we choose to undergo. How does your current self predict what an updated version would be like? The value stability problem is unsolved in humans and AIs.
“Biased” is not necessarily a value judgment. Insofar as rationality as a system, orthogonal to morality, is objective, biases as systematic deviations from rationality are also objective.
Arbitrary carries connotations of value judgment, but in a sense I think it’s fair to say that all values are fundamentally arbitrary. You can explain what caused an agent to hold those values, but you can’t judge whether values are good or bad except by the standards of other values.
Arbitrary and Bias are not defined properties in formal logic. The bare assertion that they are properties of rationality assumes the conclusion.
Keep in mind that “rationality” has a multitude of meanings, and this community’s usage of rationality is idiosyncratic.
Non-contradictoriness probably isn’t a sufficient condition for truth.
Sure, but the discussion is partially a search for other criteria to evaluate the truth of moral propositions. Arbitrariness is not such a criterion. If you were to taboo “arbitrary”, I strongly suspect you’d find moral propositions that are inconsistent with being value-neutral.
Arbitrary and Bias are not defined properties in formal logic. The bare assertion that they are properties of rationality assumes the conclusion.
There’s plenty of material on this site and elsewhere advising rationalists to avoid arbitrariness and bias. Arbitrariness and bias are essentially structural/functional properties, so I do not see why they could not be given formal definitions.
Sure, but the discussion is partially a search for other criteria to evaluate the truth of moral propositions. Arbitrariness is not such a criterion.
Arbitrary and biased claims are not candidates for being ethical claims at all.
The AI decides whether it will change its source code in a particular way or not by checking against whether this will serve its terminal values.
How does it predict that? How does the less intelligent version in the past predict what updating to a more intelligent version will do?
Can you see an “In order to maximize expected paperclips, I will modify my values to be in accordance with objective morality rather than making paperclips” coming into the picture?
How about:
“in order to be an effective rationalist, I will free myself from all bias and arbitrariness—oh, hang on, paperclipping is a bias..”.
Well a paperclipper would just settle for being a less than perfect rationalist. But that doesn’t prove anything about typical, average rational agents, and it doesn’t prove anything about ideal rational agents. Objective morality is sometimes described as what ideal rational agents would converge on. Clippers aren’t ideal, because they have a blind spot about paperclips. Clippers aren’t relevant.
Well a paperclipper would just settle for being a less than perfect rationalist. But that doesn’t prove anything about typical, average rational agents, and it doesn’t prove anything about ideal rational agents.
You’ve extrapolated out “typical, average rational agents” from a set of one species, where every individual shares more than a billion years of evolutionary history.
Objective morality is sometimes described as what ideal rational agents would converge on
On what basis do you conclude that this is a real thing, whereas terminal values are a case of “all unicorns have horns?”
You’ve extrapolated out “typical, average rational agents” from a set of one species, where every individual shares more than a billion years of evolutionary history.
Messy solutions are more common in mindspace than contrived ones.
On what basis do you conclude that this is a real thing
Messy solutions are more often wrong than ones which control for the mess.
Something that is wrong is not a solution. Mindspace is populated by solutions to the problem of how to implement a mind. It’s a small corner of algorithm-space.
This doesn’t even address my question.
Since I haven’t claimed that rational convergence on ethics is highly likely or inevitable, I don’t have to answer questions about why it would be highly likely or inevitable.
Do you think that it’s even plausible? Do you think we have any significant reason to suspect it, beyond our reason to suspect, say, that the Invisible Flying Noodle Monster would just reprogram the AI with its noodley appendage?
There are experts in moral philosophy, and they generally regard the question of realism versus relativism (etc.) to be wide open. The “realism—huh, what, no?!?” response is standard on LW and only on LW. But I don’t see any superior understanding on LW.
Both realism¹ and relativism are false. Unfortunately this comment is too short to contain the proof, but there’s a passable sequence on it.
¹ As you’ve defined it here, anyway. Moral realism as normally defined simply means “moral statements have truth values” and does not imply universal compellingness.
Well, there’s the more obvious sense, that there can always exist an “irrational” mind that simply refuses to believe in gravity, regardless of the strength of the evidence. “Gravity makes things fall” is true, because it does indeed make things fall. But not compelling to those types of minds.
But, in a more narrow sense, which we are more interested in when doing metaethics, a sentence of the form “action A is xyzzy” may be a true classification of A, and may be trivial to show, once “xyzzy” is defined. But an agent that did not care about xyzzy would not be moved to act based on that. It could recognise the truth of the statement but would not care.
For a stupid example, I could say to you “if you do 13 push-ups now, you’ll have done a prime number of push-ups”. Well, the statement is true, but the majority of the world’s population would be like “yeah, so what?”.
In contrast, a statement like “if you drink-drive, you could kill someone!” is generally (but sadly not always) compelling to humans. Because humans like to not kill people, they will generally choose not to drink-drive once they are convinced of the truth of the statement.
But isn’t the whole debate about moral realism vs. anti-realism about whether “Don’t murder” is universally compelling to humans? Noticing that pebblesorters aren’t compelled by our values doesn’t explain whether humans should necessarily find “don’t murder” compelling.
I identify as a moral realist, but I don’t believe all moral facts are universally compelling to humans, at least not if “universally compelling” is meant descriptively rather than normatively. I don’t take moral realism to be a psychological thesis about what particular types of intelligences actually find compelling; I take it to be the claim that there are moral obligations and that certain types of agents should adhere to them (all other things being equal), irrespective of their particular desire sets and whether or not they feel any psychological pressure to adhere to these obligations. This is a normative claim, not a descriptive one.
What? Moral realism (in the philosophy literature) is about whether moral statements have truth values, that’s it.
When I said universally compelling, I meant universally. To all agents, not just humans. Or any large class. For any true statement, you can probably expect to find a surprisingly large number of agents who just don’t care about it.
Whether “don’t murder” (or rather, “murder is bad” since commands don’t have truth values, and are even less likely to be generally compelling) is compelling to all humans is a question for psychology. As it happens, given the existence of serial killers and sociopaths, probably the answer is no, it isn’t. Though I would hope it to be compelling to most.
I have shown you two true but non-universally-compelling arguments. Surely the difference must be clear now.
What? Moral realism (in the philosophy literature) is about whether moral statements have truth values, that’s it.
This is incorrect, in my experience. Although “moral realism” is a notoriously slippery phrase and gets used in many subtly different ways, I think most philosophers engaged in the moral realism vs. anti-realism debate aren’t merely debating whether moral statements have truth values. The position you’re describing is usually labeled “moral cognitivism”.
Anyway, I suspect you mis-spoke here, and intended to say that moral realists claim that (certain) moral statements are true, rather than just that they have truth values (“false” is a truth value, after all). But I don’t think that modification captures the tenor of the debate either. Moral realists are usually defending a whole suite of theses—not just that some moral statements are true, but that they are true objectively and that certain sorts of agents are under some sort of obligation to adhere to them.
I think you guys should taboo “moral realism”. I understand that it’s important to get the terminology right, but IMO debates about nothing but terminology have little value.
Anyway, I suspect you mis-spoke here, and intended to say that moral realists claim that (certain) moral statements are true, rather than just that they have truth values (“false” is a truth value, after all).
Err, right, yes, that’s what I meant. Error theorists do of course also claim that moral statements have truth values.
Moral realists are usually defending a whole suite of theses—not just that some moral statements are true, but that they are true objectively and that certain sorts of agents are under some sort of obligation to adhere to them.
True enough, though I guess I’d prefer to talk about a single well-specified claim than a “usually” cluster in philosopher-space.
If that philosopher believes that statements like “murder is wrong” are true, then they are indeed a realist. Did I say something that looked like I would disagree?
You guys are talking past each other, because you mean something different by ‘compelling’. I think Tim means that X is compelling to all human beings if any human being will accept X under ideal epistemic circumstances. You seem to take ‘X is universally compelling’ to mean that all human beings already do accept X, or would on a first hearing.
Would you agree that all human beings would accept all true statements under ideal epistemic circumstances (i.e. having heard all the arguments, seen all the evidence, in the best state of mind)?
I guess I must clarify. When I say ‘compelling’ here I am really talking mainly about motivational compellingness. Saying “if you drink-drive, you could kill someone!” to a human is generally, motivationally compelling as an argument for not drink-driving: because humans don’t like killing people, a human will decide not to drink-drive (one in a rational state of mind, anyway).
This is distinct from accepting statements as true or false! Any rational agent, give or take a few, will presumably believe you about the causal relationship between drink-driving and manslaughter once presented with sufficient evidence. But it is a tiny subset of these who will change their decisions on this basis. A mind that doesn’t care whether it kills people will see this information as an irrelevant curiosity.
Having looked over that sequence, I haven’t found any proof that moral realism (on either definition) or moral relativism is false. Could you point me more specifically to what you have in mind (or just put the argument in your own words, if you have the time)?
Edit: (Sigh), I appreciate the link, but I can’t make heads or tails of ‘No Universally Compelling Arguments’. I speak from ignorance as to the meaning of the article, but I can’t seem to identify the premises of the argument.
If we restrict ourselves to minds specifiable in a trillion bits or less, then each universal generalization “All minds m: X(m)” has two to the trillionth chances to be false, while each existential generalization “Exists mind m: X(m)” has two to the trillionth chances to be true.
This would seem to argue that for every argument A, howsoever convincing it may seem to us, there exists at least one possible mind that doesn’t buy it.
So, there’s some sort of assumption as to what minds are:
I also wish to establish the notion of a mind as a causal, lawful, physical system… [emphasis original]
and an assumption that a suitably diverse set of minds can be described in less than a trillion bits. Presumably the reason for that upper bound is that there are a few Fermi estimates putting the information content of a human brain in the neighborhood of one trillion bits.
Of course, if you restrict the set of minds to those with special properties (e.g., human minds), then you might find universally compelling arguments on that basis:
Oh, there might be argument sequences that would compel any neurologically intact human...
From which we get Coherent Extrapolated Volition and friends.
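For readers who want the quoted counting argument spelled out, here is one possible formalisation (it is mine, not the original author’s, and the independence assumption it leans on is exactly what the exchange below goes on to question):

```latex
% Let M be the set of minds specifiable in at most a trillion bits, so
% |M| \le 2^{10^{12}}. Suppose, as a strong simplifying assumption, that each
% mind m in M independently fails to be moved by argument A with probability
% at least \epsilon > 0. Then
\[
  \Pr\big[\,\forall m \in M :\ A(m)\,\big] \;\le\; (1 - \epsilon)^{|M|},
\]
% which is astronomically small, while correspondingly
\[
  \Pr\big[\,\exists m \in M :\ \neg A(m)\,\big] \;\ge\; 1 - (1 - \epsilon)^{|M|}.
\]
```

On this reading, the force of the argument comes entirely from the size of |M| together with the independence assumption; the objection below about whether ‘being convinced’ is a bit that can flip independently targets exactly that.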
If we restrict ourselves to minds specifiable in a trillion bits or less, then each universal generalization “All minds m: X(m)” has two to the trillionth chances to be false, while each existential generalization “Exists mind m: X(m)” has two to the trillionth chances to be true.
This doesn’t seem true to me, at least not as a general rule. For example, given every terrestrial DNA sequence describable in a trillion bits or less, it is not the case that every generalization of the form ‘s:X(s)’ has two to the trillionth chances to be false (e.g. ‘have more than one base pair’, ‘involve hydrogen’ etc.). Given that this doesn’t hold true of many other things, is this supposed to be a special fact about minds? Even then, it would seem odd to say that while all generalizations of the form m:X(m) have two to the trillionth chances to be false, nevertheless the generalization ‘for all minds, a generalization of the form m:X(m) has two to the trillionth chances to be false’ (which does seem to be of the form m:X(m)) is somehow more likely.
Also, doesn’t this inference imply that ‘being convinced by an argument’ is a bit that can flip on or off independently of any others? Eliezer doesn’t think that’s true, and I can’t imagine why he would think his (hypothetical) interlocutor would accept it.
I mean to say, I think the argument is something of a paradox:
The claim the argument purports to defeat is something like this: for all minds, A is convincing. Let’s call this m:A(m).
The argument goes like this: for all minds (at or under a trillion bits etc.), a generalization of the form m:X(m) has a one in two to the trillionth chance of being true for each mind. Call this m:U(m), if you grant me that this claim has the form m:X(m).
If we infer from m:U(m) that any claim of the form m:X(m) is unlikely to be true, then to whatever extent I am persuaded that m:A(m) is unlikely to be true, to that extent I ought to be persuaded that m:U(m) is unlikely to be true. You cannot accept the argument, because accepting it as decisive entails accepting decisive reasons for rejecting it.
The argument seems to be fixable at this stage, since there’s a lot of room to generate significant distinctions between m:A(m) and m:U(m). If you were pressed to defend it (presuming you still wish to be generous with your time) how would you fix this? Or am I getting something very wrong?
for all minds (at or under a trillion bits etc.), a generalization of the form m:X(m) has a one in two to the trillionth chance of being true for each mind.
That’s not what it says; compare the emphasis in both quotes.
If we restrict ourselves to minds specifiable in a trillion bits or less, then each universal generalization “All minds m: X(m)” has two to the trillionth chances to be false, while each existential generalization “Exists mind m: X(m)” has two to the trillionth chances to be true.
Sorry, I may have misunderstood and presumed that ‘two to the trillionth chances to be false’ meant ‘one in two to the trillionth chances to be true’. That may be wrong, but it doesn’t affect my argument at all: EY’s argument for the implausibility of m:A(m) is that claims of the form m:X(m) are all implausible. His argument to the effect that all claims of the form m:X(m) are implausible is itself a claim of the form m:X(m).
Sorry, I was speaking ambiguously. I meant ‘rational’ not in the normative sense that distinguishes good agents from bad ones, but ‘rational’ in the broader, descriptive sense that distinguishes anything capable of responding to reasons (even terrible or false ones) from something that isn’t. I assumed that was the sense of ‘rational’ Prawn was using, but that may have been wrong.
Irrelevant. I am talking about rational minds, he is talking about physically possible ones.
UFAI sounds like a counterexample, but I’m not interested in arguing with you about it. I only responded because someone asked for a shortcut in the metaethics sequence.
Can you explain what you could see which would suggest to you a greater level of understanding than is prevalent among moral philosophers?
Also, moral philosophers mostly regard the question as open in the sense that some of them think that it’s clearly resolved in favor of non-realism, and some philosophers are just not getting it, or that it’s clearly resolved in favor of realism, and some philosophers are just not getting it. Most philosophers are not of the opinion that it could turn out either way and we just don’t know yet.
Can you explain what you could see which would suggest to you a greater level of understanding than is prevalent among moral philosophers?
What I am seeing is:
* much-repeated confusions—the Standard Muddle
* appeals to LW doctrines which aren’t well-founded or well respected outside LW.
If I knew exactly what superior insight into the problem was, I would write it up and become famous. Insight doesn’t work like that; you don’t know it in advance, you get an “Aha” when you see it.
Also, moral philosophers mostly regard the question as open in the sense that some of them think that it’s clearly resolved in favor of non-realism, and some philosophers are just not getting it, or that it’s clearly resolved in favor of realism, and some philosophers are just not getting it. Most philosophers are not of the opinion that it could turn out either way and we just don’t know yet.
If people can’t agree on how a question is closed, it’s open.
Can you explain what these confusions are, and why they’re confused?
In my time studying philosophy, I observed a lot of confusions which are largely dispensed with on Less Wrong. Luke wrote a series of posts on this. This is one of the primary reasons I bothered sticking around in the community.
If people can’t agree on how a question is closed, it’s open.
A question can still be “open” in that sense when all the information necessary for a rational person to make a definite judgment is available.
Can you explain what these confusions are, and why they’re confused?
E.g.:
* You are trying to impose your morality.
* I can think of one model of moral realism, and it doesn’t work, so I will ditch the whole thing.
In my time studying philosophy, I observed a lot of confusions which are largely dispensed with on Less Wrong. Luke wrote a series of posts on this.
LW doesn’t even claim to have more than about two “dissolutions”. There are probably hundreds of outstanding philosophical problems. Whence the “largely”?
Luke wrote a series of posts on this
Which were shot down by philosophers.
A question can still be “open” in that sense when all the information necessary for a rational person to make a definite judgment is available.
Then it can only be open in the opinions of the irrational. So basically you are saying the experts are incompetent.
I can think of one model of moral realism, and it doesn’t work, so I will ditch the whole thing.
This certainly doesn’t describe my reasoning on the matter, and I doubt it describes many others’ here either.
The way I consider the issue, if I try to work out how the universe works from the ground up, I cannot see any way that moral realism would enter into it, whereas I can easily see how value systems would, so I regard assigning non-negligible probability to moral realism as privileging the hypothesis until I find some compelling evidence to support it, which, having spent a substantial amount of time studying moral philosophy, I have not yet found.
LW doesn’t even claim to have more than about two “dissolutions”. There are probably hundreds of outstanding philosophical problems. Whence the “largely”?
I gave up my study of philosophy because I found such confusions so pervasive. Many “outstanding” philosophical problems can be discarded because they rest on other philosophical problems which can themselves be discarded.
Which were shot down by philosophers.
Can you give any examples of such, where you think that the philosophers in question addressed legitimate errors?
Then it can only be open in the opinions of the irrational. So basically you are saying the experts are incompetent.
Yes. I am willing to assert that while there are some competent philosophers, many philosophical disagreements exist only because of incompetent “experts” perpetuating them. This is the conclusion that my experience with the field has wrought.
This certainly doesn’t describe my reasoning on the matter, and I doubt it describes many others’ here either.
I mentioned them because they both came up recently.
The way I consider the issue, if I try to work out how the universe works from the ground up, I cannot see any way that moral realism would enter into it, whereas I can easily see how value systems would, so I regard assigning non-negligible probability to moral realism as privileging the hypothesis until I find some compelling evidence to support it, which, having spent a substantial amount of time studying moral philosophy, I have not yet found.
I have no idea what you mean by that. I don’t think value systems don’t come into it, I just think they are not isolated from rationality. And I am sceptical that you could predict any higher-level phenomenon from “the ground up”, whether it’s morality or mortgages.
I gave up my study of philosophy because I found such confusions so pervasive. Many “outstanding” philosophical problems can be discarded because they rest on other philosophical problems which can themselves be discarded.
Where is it proven they can be discarded?
Can you give any examples of such, where you think that the philosophers in question addressed legitimate errors?
All of them.
Yes. I am willing to assert that while there are some competent philosophers, many philosophical disagreements exist only because of incompetent “experts” perpetuating them. This is the conclusion that my experience with the field has wrought.
Are you aware that that is basically what every crank says about some other field?
Are you aware that that is basically what every crank says about some other field?
Presumably, if I’m to treat as meaningful evidence about Desrtopa’s crankiness the fact that cranks make statements similar to Desrtopa, I should first confirm that non-cranks don’t make similar statements.
It seems likely to me that for every person P, there exists some field F such that P believes many aspects of F exist only because of incompetent “experts” perpetuating them. (Consider cases like F=astrology, F=phrenology, F=supply-side economics, F= feminism, etc.) And that this is true whether P is a crank or a non-crank.
So it seems this line of reasoning depends on some set F2 of fields such that P believes this of F in F2 only if P is a crank.
I understand that you’re asserting implicitly that moral philosophy is a field in F2, but this seems to be precisely what Desrtopa is disputing.
Could we reasonably say that an F is in F2 if most of the institutional participants in that F are intelligent, well-educated people? This leaves room for cranks who are right to object to F, of course.
So, just to pick an example, IIRC Dan Dennett believes the philosophical study of consciousness (qualia, etc.) is fundamentally confused in more or less the same way Desrtopa claims the philosophical study of ethics is.
So under this formulation, if most of the institutional participants in the philosophical study of consciousness are intelligent, well-educated people, Dan Dennett is a crank?
No, I don’t think we can reasonably say that. Dan Dennett might be a crank, but it takes more than that argument to demonstrate the fact.
Good point. So how about this: someone is a crank if they object to F, where F is in F2 (by my above standard), and the reasons they have for objecting to F are not recognized as sound by a proportionate number of intelligent and well educated people.
(shrug) I suppose that works well enough, for some values of “proportionate.”
Mostly I consider this a special case of the basic “who do I trust?” social problem, applied to academic disciplines, and I don’t have any real problem saying about an academic discipline “this discipline is fundamentally confused, and the odds of work in it contributing anything valuable to the world is slim.”
Of course, as Prawn has pointed out a few times, there’s also the question of where we draw the lines around a discipline, but I mostly consider that an orthogonal question to how we evaluate the discipline.
I think this question is moot in the case of philosophy in general then; I think any philosopher worth their shirt should tell you that trust is a wholly inappropriate attitude toward philosophers, philosophical institutions and philosophical traditions.
Not in the sense I meant it. If a philosopher makes a claim that seems on the surface to be false or incoherent, I have to decide whether to devote the additional effort to evaluating it to confirm or deny that initial judgment. One of the factors that will feed into that decision will be my estimate of the prior probability that they are saying something false or incoherent. If I should refer to that using a word other than “trust”, that’s fine, tell me what word will refer to that to you and I’ll try to use it instead.
No, that describes what I’m talking about, so long as by trust you mean ‘a reason to hear out an argument that makes reference to the credibility of a field or its professionals’, rather than just ‘a reason to hear out an argument’. If the former, then I do think this is an inappropriate attitude toward philosophy. One reason for this is that such trust seems to depend on having a good standard for the success of a field independently of hearing out an argument. I can trust physicists because they make such good predictions, and because their work leads to such powerful technological advances. I don’t need to be a physicist to observe that. I don’t think philosophy has anything like that to speak for it. The only standards of success are the arguments themselves, and you can only evaluate them by just going ahead and doing some philosophy.
You can find trust in an institution independently of such standards by watching to see whether people you think are otherwise credible take it seriously. That will of course work with philosophy too, but if you trust Tom to be able to judge whether or not a philosophical claim is worth pursuing (and if I’m right about the above), then Tom can only be trustworthy in this regard because he has been doing philosophy (i.e. engaging with the argument). This could get you through the door on some particular philosophical claim, but not into philosophy generally.
so long as by trust you mean ‘a reason to hear out an argument that makes reference to the credibility of a field or its professionals’, rather than just ‘a reason to hear out an argument’.
I mean neither, I mean ‘a reason to devote time and resources to evaluating the evidence for and against a position.’ As you say, I can only evaluate a philosophical argument by ‘going ahead and doing some philosophy,’ (for a sufficiently broad understanding of ‘philosophy’), but my willingness to do, say, 20 hours of philosophy in order to evaluate Philosopher Sam’s position is going to depend on, among other things, my estimate of the prior probability that Sam is saying something false or incoherent. The likelier I think that is, the less willing I am to spend those 20 hours.
I mean neither, I mean ‘a reason to devote time and resources to evaluating the evidence for and against a position.’
That’s fine, that’s not different from ‘hearing out an argument’ in any way important to my point (unless I’m missing something).
EDIT: Sorry, if you don’t want to include ‘that makes some reference to the credibility...etc.’ (or something like that) in what you mean by ‘trust’ then you should use a different term. Curiosity, or money, or romantic interest would all be reasons to devote time...etc. and clearly none of those are rightly called ‘trust’.
my estimate of the prior probability that Sam is saying something false or incoherent.
What do you have in mind as the basis for such a prior? Can you give me an example?
Point taken about other reasons to devote resources other than trust. I think we’re good here.
Re: example… I don’t mean anything deeply clever. E.g., if the last ten superficially-implausible ideas Sam espoused were false or incoherent, my priors for it will be higher than if the last ten such ideas were counterintuitive and brilliant.
Hm. I can’t argue with that, and I suppose it’s trivial to extend that to ‘if the last ten superficially-implausible ideas philosophy professors/books/etc. espoused were false or incoherent...’. So, okay, trust is an appropriate (because necessary) attitude toward philosophers and philosophical institutions. I think it’s right to say that philosophy doesn’t have external indicators in the way physics or medicine does, but the importance of that point seems diminished.
So, just to pick an example, IIRC Dan Dennett believes the philosophical study of consciousness (qualia, etc.) is fundamentally confused in more or less the same way Desrtopa claims the philosophical study of ethics is.
Dennett only thinks the idea of qualia is confused. He has no problem with his own books on consciousness.
So under this formulation, if most of the institutional participants in the philosophical study of consciousness are intelligent, well-educated people, Dan Dennett is a crank?
No. He isn’t dismissing a whole academic subject, or a sub-field. Just one idea.
What is Dennett’s account for why philosophers of consciousness other than himself continue to think that a dismissable idea like qualia is worth continuing to discuss, even though he considers it closed?
While going on tangents is a common and expected occurrence, each such tangent has a chance of steering/commandeering the original conversation. LW has a tendency to go meta too much, when actual object-level discourse would have a higher content value.
While you were practically invited to indulge in the death-by-meta with the hook of “Are you aware that that is basically what every crank says about some other field?”, we should be aware of when we are leaving object-level debate, and of the consequences thereof. Especially since the lure can be strong:
When sufficiently meta, object-level disagreements may fizzle into cosmic/abstract insignificance, allowing for a peaceful pseudo-resolution, which ultimately just protects that which should be destroyed by the truth from being destroyed.
Such lures may be interpreted similarly to ad hominems: The latter try to drown out object-level disagreements by flinging shit until everyone’s dirty, the former zoom out until everyone’s dizzy floating in space, with vertigo. Same result to the actual debate. It’s an effective device, and one usually embraced by someone who feels like object-level arguments no longer serve his/her goals.
Ironically, this very comment goes meta lamenting going meta.
I have no idea what you mean by that. I don’t think value systems don’t come into it, I just think they are not isolated from rationality. And I am sceptical that you could predict any higher-level phenomenon from “the ground up”, whether it’s morality or mortgages.
I mean that value systems are a function of physically existing things, the way a 747 is a function of physically existing things, but we have no evidence suggesting that objective morality is an existing thing. We have standards by which we judge beauty, and we project those values onto the world, but the standards are in us, not outside of us. We can see, in reductionist terms, how the existence of ethical systems within beings, which would feel from the inside like the existence of an objective morality, would come about.
Create a reasoning engine that doesn’t have those ethical systems built into it, and it would have no reason to care about them.
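To illustrate the point about the reasoning machinery being separable from what it cares about, here is a minimal, purely hypothetical sketch. The planner, the candidate actions, the numbers, and both utility functions are invented for the example and are not meant as a model of any real agent.

```python
from typing import Callable, Dict

Outcome = Dict[str, float]

def choose(actions: Dict[str, Outcome],
           utility: Callable[[Outcome], float]) -> str:
    """A generic chooser: ranks actions purely by the utility function it is handed."""
    return max(actions, key=lambda name: utility(actions[name]))

# Hypothetical outcomes of each action.
actions = {
    "build_factory": {"paperclips": 100, "happiness": -5},
    "host_festival": {"paperclips": 0,   "happiness": 50},
    "do_nothing":    {"paperclips": 0,   "happiness": 0},
}

# Two different value systems plugged into the same reasoning machinery.
paperclip_utility = lambda outcome: outcome["paperclips"]
humane_utility    = lambda outcome: outcome["happiness"]

print(choose(actions, paperclip_utility))   # -> build_factory
print(choose(actions, humane_utility))      # -> host_festival
```

The chooser is identical in both calls; only the utility function changes, which is the sense in which the values have to be built in rather than being derivable from the reasoning machinery itself.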
Where is it proven they can be discarded?
You can’t build a tower on empty air. If a debate has been going on for hundreds of years, stretching back to an argument which rests on “this defies our moral intuitions, therefore it’s wrong,” and that was never addressed with “moral intuitions don’t work that way,” then the debate has failed to progress in a meaningful direction, much as a debate over whether a tree falling in an empty forest makes a sound has if nobody bothers to dissolve the question.
All of them.
That’s not an example. Please provide an actual one.
Are you aware that that is basically what every crank says about some other field?
Sure, but it’s also what philosophers say about each other, all the time. Wittgenstein condemned practically all his predecessors and peers as incompetent, and declared that he had solved nearly the entirety of philosophy. Philosophy as a field is full of people banging their heads on a wall at all those other idiots who just don’t get it. “Most philosophers are incompetent, except for the ones who’re sensible enough to see things my way,” is a perfectly ordinary perspective among philosophers.
I mean that value systems are a function of physically existing things, the way a 747 is a function of physically existing things, but we have no evidence suggesting that objective morality is an existing thing.
But I wasn’t saying that. I am arguing that moral claims have truth values that aren’t indexed to individuals or societies. That epistemic claim can be justified by appeal to an ontology including Moral Objects, but that is not how I am justifying it: my argument is based on rationality, as I have said many times.
We have standards by which we judge beauty, and we project those values onto the world, but the standards are in us, not outside of us.
We have standards by which we judge the truth values of mathematical claims, and they are inside us too, and that doesn’t stop mathematics being objective. Relativism requires that truth values are indexed to us, that there is one truth for me and another for thee. Being located in us, or being operated by us, are not sufficient criteria for being indexed to us.
We can see, in reductionist terms, how the existence of ethical systems within beings, which would feel from the inside like the existence of an objective morality, would come about.
We can see, in reductionist terms, how the entities could converge on a uniform set of truth values. There is nothing non-reductionist about anything I have said. Reductionism does not force one answer to metaethics.
Create a reasoning engine that doesn’t have those ethical systems built into it, and it would have no reason to care about them.
Provide evidence that ethics is a whole separate module, and not part of general reasoning ability.
You can’t build a tower on empty air. If a debate has been going on for hundreds of years, stretching back to an argument which rests on “this defies our moral intuitions, therefore it’s wrong,” and that was never addressed with “moral intuitions don’t work that way,” then the debate has failed to progress in a meaningful direction, much as a debate over whether a tree falling in an empty forest makes a sound has if nobody bothers to dissolve the question.
Please explain why moral intuitions don’t work that way.
Please provide some foundations for something that aren’t unjustified by anything more foundational.
That’s not an example. Please provide an actual one.
You can select one at random, obviously.
Sure, but it’s also what philosophers say about each other, all the time.
No, philosophers don’t regularly accuse each other of being incompetent... just of being wrong. There’s a difference.
Wittgenstein condemned practically all his predecessors and peers as incompetent, and declared that he had solved nearly the entirety of philosophy.
You are inferring a lot from one example.
Philosophy as a field is full of people banging their heads on a wall at all those other idiots who just don’t get it. “Most philosophers are incompetent, except for the ones who’re sensible enough to see things my way,” is a perfectly ordinary perspective among philosophers.
But I wasn’t saying that. I am arguing that moral claims have truth values that aren’t indexed to individuals or societies. That epistemic claim can be justified by appeal to an ontology including Moral Objects, but that is not how I am justifying it: my argument is based on rationality, as I have said many times.
I don’t understand, can you rephrase this?
We have standards by which we judge the truth values of mathematical claims, and they are inside us too, and that doesn’t stop mathematics being objective. Relativism requires that truth values are indexed to us, that there is one truth for me and another for thee. Being located in us, or being operated by us, are not sufficient criteria for being indexed to us.
The standards by which we judge the truth of mathematical claims are not just inside us. One object plus another object will continue to equal two objects whether or not there are any living beings to make that judgment. Math is not something we’ve created within ourselves, but something we’ve discovered and observed.
If our mathematical models ever stop being able to predict in advance the behavior of the universe, then we will have rather more reason to suspect that the math inside us is different from the math outside of us.
What evidence do we have that this is the case for morality?
Provide evidence that ethics is a whole separate module, and not part of general reasoning ability.
My assertion is that, if we judge ethics as a rational system, innate values are among the axioms that the system is predicated on. You cannot prove the axioms of a system within that system, and an ethical system predicated on premises like “happiness is good” will not itself be able to prove the goodness of happiness.
While we could suppose that the axioms which our ethical systems are predicated on are objectively true, we have considerable reason to believe that we would have developed these axioms for adaptive reasons, even if there were no sense in which objective moral axioms exist, and we do not have evidence which suggests that objective, independently existing true moral axioms do exist.
Please explain why moral intuitions don’t work that way.
People can be induced to strongly support opposing responses to the same moral dilemma, just by rephrasing it differently to trigger different heuristics. Our moral intuitions are incoherent.
Please provide some foundations for something that aren’t unjustified by anything more foundational.
I don’t think I understand this, can you rephrase it?
You can select one at random, obviously.
I do not recall any creditable attempts, which places me in a disadvantaged position with respect to locating them. You’re the one claiming that they’re there at all, that’s why I’m asking you to do it.
No, philosophers don’t regularly accuse each other of being incompetent... just of being wrong. There’s a difference.
Philosophers don’t usually accuse each other of being incompetent in their publications, because it’s not conducive to getting other philosophers to regard their arguments dispassionately, and that sort of open accusation is generally frowned upon in academic circles whether one believes it or not. They do regularly accuse each other of being comprehensively wrong for their entire careers. In my personal conversations with philosophers (and I never considered myself to have really taken a class, or attended a lecture by a visitor, if I didn’t speak with the person teaching it on a personal basis to probe their thoughts beyond the curriculum), I observed a whole lot of frustration with philosophers who they think just don’t get their arguments. It’s unsurprising that people would tend to become so frustrated participating in a field that basically amounts to long-running arguments extended over decades or centuries. Imagine the conversation we’re having now going on for eighty years, and neither of us has changed our minds. If you didn’t find my arguments convincing, and I hadn’t budged in all that time, don’t you think you’d start to suspect that I was particularly thick?
You are inferring a lot from one example.
I’m using an example illustrative of my experience.
Sounds to me like PrawnOfFate is saying that any sufficiently rational cognitive system will converge on a certain set of ethical goals as a consequence of its structure, i.e. that (human-style) ethics is a property that reliably emerges in anything capable of reason.
I’d say the existence of sociopathy among humans provides a pretty good counterargument to this (sociopaths can be pretty good at accomplishing their goals, so the pathology doesn’t seem to be indicative of a flawed rationality), but at least the argument doesn’t rely on counting fundamental particles of morality or something.
I would say so also, but PrawnOfFate has already argued that sociopaths are subject to additional egocentric bias relative to normal people and thereby less rational. It seems to me that he’s implicitly judging rationality by how well it leads to a particular body of ethics he already accepts, rather than how well it optimizes for potentially arbitrary values.
Well, I’m not a psychologist, but if someone asked me to name a pathology marked by unusual egocentric bias I’d point to NPD, not sociopathy.
That brings up some interesting questions concerning how we define rationality, though. Pathologies in psychology are defined in terms of interference with daily life, and the personality disorder spectrum in particular usually implies problems interacting with people or societies. That could imply either irreconcilable values or specific flaws in reasoning, but only the latter is irrational in the sense we usually use around here. Unfortunately, people are cognitively messy enough that the two are pretty hard to distinguish, particularly since so many human goals involve interaction with other people.
In any case, this might be a good time to taboo “rational”.
The standards by which we judge the truth of mathematical claims are not just inside us.
How do we judge claims about transfinite numbers?
One object plus another object will continue to equal two objects whether or not there are any living beings to make that judgment. Math is not something we’ve created within ourselves, but something we’ve discovered and observed.
If our mathematical models ever stop being able to predict in advance the behavior of the universe, then we will have rather more reason to suspect that the math inside us is different from the math outside of us.
Mathematics isn’t physics. Mathematicians prove theorems from axioms, not from experiments.
Provide evidence that ethics is a whole separate module, and not part of general reasoning ability.
My assertion is that, if we judge ethics as a rational system, innate values are among the axioms that the system is predicated on.
Not necessarily. E.g., for utilitarians, values are just facts that are plugged into the metaethics to get concrete actions.
You cannot prove the axioms of a system within that system, and an ethical system predicated on premises like “happiness is good” will not itself be able to prove the goodness of happiness.
Metaethical systems usually have axioms like “Maximising utility is good”.
While we could suppose that the axioms which our ethical systems are predicated on are objectively true, we have considerable reason to believe that we would have developed these axioms for adaptive reasons, even if there were no sense in which objective moral axioms exist, and we do not have evidence which suggests that objective, independently existing true moral axioms do exist.
I am not sure what you mean by “exist” here. Claims are objectively true if most rational minds converge on them. That doesn’t require Objective Truth to float about in space here.
Please explain why moral intuitions don’t work that way.
People can be induced to strongly support opposing responses to the same moral dilemma, just by rephrasing it differently to trigger different heuristics. Our moral intuitions are incoherent.
Does that mean we can’t use moral intuitions at all, or that they must be used with caution?
I don’t think I understand this, can you rephrase it?
Philosophers talk about intuitions, because that is the term for something foundational that seems true, but can’t be justified by anything more foundational. LessWrongians don’t like intuitions, but don’t seem to be able to explain how to manage without them.
I do not recall any creditable attempts, which places me in a disadvantaged position with respect to locating them.
Did you post any comments explaining to the professional philosophers where they had gone wrong?
Imagine the conversation we’re having now going on for eighty years, and neither of us has changed our minds. If you didn’t find my arguments convincing, and I hadn’t budged in all that time, don’t you think you’d start to suspect that I was particularly thick?
I don’t see the problem. Philosophical competence is largely about understanding the problem.
Mathematics isn’t physics. Mathematicians prove theorems from axioms, not from experiments.
Yes, but the fact that the universe itself seems to adhere to the logical systems by which we construct mathematics gives credence to the idea that the logical systems are fundamental, something we’ve discovered rather than produced. We judge claims about unobserved mathematical constructs like transfinites according to those systems.
Metaethical systems usually have axioms like “Maximising utility is good”.
But utility is a function of values. A paperclipper will produce utility according to different values than a human.
I am not sure what you mean by “exist” here. Claims are objectively true if most rational minds converge on them. That doesn’t require Objective Truth to float about in space here.
Why would most rational minds converge on values? Most human minds converge on some values, but we share almost all our evolutionary history and brain structure. The fact that most humans converge on certain values is no more indicative of rational minds in general doing so than the fact that most humans have two hands is indicative of most possible intelligent species converging on having two hands.
Does that mean we can’t use moral intuitions at all, or that they must be used with caution?
It means we should be aware of what our intuitions are and what they’ve developed to be good for. Intuitions are evolved heuristics, not a priori truth generators.
Philosophers talk about intuitions, because that is the term for something foundational that seems true, but can’t be justified by anything more foundational. LessWrongians don’t like intuitions, but don’t seem to be able to explain how to manage without them.
It seems like you’re equating intuitions with axioms here. We can (and should) recognize that our intuitions are frequently unhelpful at guiding us to the truth, without throwing out all axioms.
Did you post any comments explaining to the professional philosophers where they had gone wrong?
If I did, I don’t remember them. I may have, I may have felt someone else adequately addressed them, I may not have felt it was worth the bother.
It seems to me that you’re trying to foist onto me the effort of locating something which you were the one to testify was there in the first place.
I don’t see the problem. Philosophical competence is largely about understanding the problem.
And philosophers frequently fall into the pattern of believing that other philosophers disagree with each other due to failure to understand the problems they’re dealing with.
In any case, I reject the notion that dismissing large contingents of philosophers as lacking in competence is a valuable piece of evidence with respect to crankishness, and if you want to convince me that I am taking a crankish attitude, you’ll need to offer some other evidence.
Yes, but the fact that the universe itself seems to adhere to the logical systems by which we construct mathematics gives credence to the idea that the logical systems are fundamental, something we’ve discovered rather than produced. We judge claims about unobserved mathematical constructs like transfinites according to those systems.
But claims about transfinites don’t correspond directly to any object. Maths is “spun off” from other facts, on your view. So, by analogy, moral realism could be “spun off” without needing any Form of the Good to correspond to goodness.
Metaethical systems usually have axioms like “Maximising utility is good”.
But utility is a function of values. A paperclipper will produce utility according to different values than a human.
You seem to be assuming that morality is about individual behaviour. A moral realist system like utilitarianism operates at the group level, and would take paperclipper values into account along with all others. Utilitarianism doesn’t care what values are, it just sums or averages them.
Or perhaps you are making the objection that an entity would need moral values to care about the preferences of others in the first place. That is addressed by another kind of realism, the rationality-based kind, which starts from noting that rational agents have to have some value in common, because they are all rational.
Why would most rational minds converge on values?
a) they don’t have to converge on preferences, since things like utilitarianism are preference-neutral.
b) they already have, to some extent, because they are rational
Most human minds converge on some values, but we share almost all our evolutionary history and brain structure. The fact that most humans converge on certain values is no more indicative of rational minds in general doing so than the fact that most humans have two hands is indicative of most possible intelligent species converging on having two hands.
I was talking about rational minds converging on the moral claims, not on values. Rational minds can converge on “maximise group utility” whilst what counts as utility varies considerably.
Philosophers talk about intuitions, because that is the term for something foundational that seems true, but can’t be justified by anything more foundational. LessWrongians don’t like intuitions, but don’t seem to be able to explain how to manage without them.
It seems like you’re equating intuitions with axioms here.
Axioms are formal statements, intuitions are gut feelings that are often used to justify axioms.
We can (and should) recognize that our intuitions are frequently unhelpful at guiding us to the truth, without throwing out all axioms.
There is another sense of “intuition” where someone feels that it’s going to rain tomorrow or something. They’re not the foundational kind.
And philosophers frequently fall into the pattern of believing that other philosophers disagree with each other due to failure to understand the problems they’re dealing with.
But claims about transfinites don’t correspond directly to any object. Maths is “spun off” from other facts, on your view. So, by analogy, moral realism could be “spun off” without needing any Form of the Good to correspond to goodness.
Spun off from what, and how?
You seem to be assuming that morality is about individual behaviour. A moral realist system like utilitarianism operates at the group level, and would take paperclipper values into account along with all others. Utilitarianism doesn’t care what values are, it just sums or averages them.
Speaking as a utilitarian, yes, utilitarianism does care about what values are. If I value paperclips, I assign utility to paperclips; if I don’t, I don’t.
Or perhaps you are making the objection that an entity would need moral values to care about the preferences of others in the first place. That is addressed by another kind of realism, the rationality-based kind, which starts from noting that rational agents have to have some value in common, because they are all rational.
Why does their being rational demand that they have values in common? Being rational means that they necessarily share a common process, namely rationality, but that process can be used to optimize many different, mutually contradictory things. Why should their values converge?
I was talking about rational minds converging on the moral claims, not on values. Rational minds can converge on “maximise group utility” whilst what is utilitous varies considerably.
So what if a paperclipper arrives at “maximize group utility,” and the only relevant member of the group which shares its conception of utility is itself, and its only basis for measuring utility is paperclips? The fact that it shares the principle of maximizing utility doesn’t demand any overlap of end-goal with other utility maximizers.
Axioms are formal statements; intuitions are gut feelings that are often used to justify axioms.
But, as I’ve pointed out previously, intuitions are often unhelpful, or even actively misleading, with respect to locating the truth.
If our axioms are grounded in our intuitions, then entities which don’t share our intuitions will not share our axioms.
So do they call for them to be fired?
No, but neither do I, so I don’t see why that’s relevant.
Request accepted. I’m not sure if he’s being deliberately obtuse, but I think this discussion probably would have borne fruit earlier if it were going to. I too often have difficulty stepping away from a discussion as soon as I think it’s unlikely to be a productive use of my time.
What is your basis for the designation ? I am not arguing with your suggestion (I was leaning in the same direction myself), I’m just genuinely curious. In other words, why do you believe that PrawnOfFate is a troll, and not someone who is genuinely confused ?
In other words, why do you believe that PrawnOfFate is a troll, and not someone who is genuinely confused ?
“Troll” is a somewhat fuzzy label. Sometimes when I am wanting to be precise or polite and avoid any hint of Fundamental Attribution Error I will replace it with the rather clumsy or verbose “person who is exhibiting a pattern of behaviour which should not be fed”. The difference between “Person who gets satisfaction from causing disruption” and “Person who is genuinely confused and is displaying an obnoxiously disruptive social attitude” is largely irrelevant (particularly when one has their Hansonian hat on).
If there was a word in popular use that meant “person likely to be disruptive and who should not be fed” that didn’t make any assumptions or implications of the intent of the accused then that word would be preferable.
I am not sure I can explain that succinctly at the moment. It is also hard to summarise how you get from counting apples to transfinite numbers.
Why does their being rational demand that they have values in common? Being rational means that they necessarily share a common process, namely rationality, but that process can be used to optimize many different, mutually contradictory things. Why should their values converge?
Rationality is not an automatic process; it is a skill that has to be learnt and consciously applied. Individuals will only be rational if their values prompt them to be. And rationality itself implies valuing certain things (lack of bias, non-arbitrariness).
So what if a paperclipper arrives at “maximize group utility,” and the only relevant member of the group which shares its conception of utility is itself, and its only basis for measuring utility is paperclips? The fact that it shares the principle of maximizing utility doesn’t demand any overlap of end-goal with other utility maximizers.
Utilitarians want to maximise the utility of their groups, not their own utility. They don’t have to believe the utility of others is utilitous to them; they just need to feed facts about group utility into an aggregation function. And, using the same facts and the same function, different utilitarians will converge. That’s kind of the point.
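To make that concrete, here is a minimal Python sketch of what I mean by an aggregation function. The names are hypothetical, and the `utilities` list just stands in for whatever facts about group utility get fed in:

```python
# A minimal sketch (not anyone's actual proposal) of a value-content-neutral
# aggregator. `utilities` is a hypothetical list of the utilities each group
# member assigns to a candidate world-state.

def group_utility(utilities, average=False):
    total = sum(utilities)
    return total / len(utilities) if average else total
```

Two utilitarians given the same inputs and the same function get the same answer, even if one member’s utility comes from paperclips and another’s from ice cream; the aggregator never inspects where the numbers come from.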
But, as I’ve pointed out previously, intuitions are often unhelpful, or even actively misleading, with respect to locating the truth.
Compared to what? Remember, I am talking about foundational intuitions, the kind at the bottom of the stack. The empirical method of locating the truth rests on the intuition that the senses reveal a real external world. Which I share. But what proves it? That’s the foundational issue.
A lot of people here would seem to disagree, since I keep hearing the objection that ethics is all about values, and values are nothing to do with rationality.
It feels to me like the Orthogonality Thesis is a fairly precise statement, and moral anti-realism is a harder to make precise but at least well understood statement, and “values are nothing to do with rationality” is something rather vague that could mean either of those things or something else.
I am getting the feeling that you’re assuming there’s something in the agent’s code that says, “you can look at and change any line of code you want, except lines 12345..99999, because that’s where your terminal goals are”. Is that right ?
You can change that line, but it will result in your optimizing for something other than paperclips, and hence in fewer paperclips.
I don’t see how this would point at the existence of an objective morality. A paperclip maximizer and an ice cream maximizer are going to share subgoals of bringing the matter of the universe under their control, but that doesn’t indicate anything other than the fact that different terminal goals are prone to share subgoals.
Also, why would it want to do experiments to divine objective morality in the first place? What results could they have that would allow it to be a more effective paperclip maximizer?
Becoming more “gravitationally efficient” would presumably help it achieve whatever goals it already had. “Paperclipping isn’t important” won’t help an AI become more paperclip efficient. If a paperclipping AI for some reason found a way to divine objective morality, and it didn’t have anything to say about paperclips, why would it care? It’s not programmed to have an interest in objective morality, just paperclips. Is the knowledge of objective morality going to go down into its circuits and throttle them until they stop optimizing for paperclips?
Sorry, I should’ve specified, “goals not directly related to their pre-set values”. Of course, the Paperclipper and the Pebblesorter may well believe that such goals are directly related to their pre-set values, but the AI can see them running in the debugger, so it knows better.
If you start thinking that way, then why do any experiments at all ? Why should we humans, for example, spend our time researching properties of crystals, when we could be solving cancer (or whatever) instead ? The answer is that some expenditure of resources on acquiring general knowledge is justified, because knowing more about the ways in which the universe works ultimately enables you to control it better, regardless of what you want to control it for.
Firstly, an objective morality—assuming such a thing exists, that is—would probably have something to say about paperclips, in the same way that gravity and electromagnetism have things to say about paperclips. While “F=GMm/R^2” doesn’t tell you anything about paperclips directly, it does tell you a lot about the world you live in, thus enabling you to make better paperclip-related decisions. And while a paperclipper is not “programmed to care” about gravity directly, it would pretty much have to figure it out eventually, or it would never achieve its dream of tiling all of space with paperclips. A paperclipper who is unable to make independent discoveries is a poor paperclipper indeed.
Secondly, again, I’m not sure if concepts such as “want” or “care” even apply to an agent that is able to fully introspect and modify its own source code. I think anthropomorphising such an agent is a mistake.
I am getting the feeling that you’re assuming there’s something in the agent’s code that says, “you can look at and change any line of code you want, except lines 12345..99999, because that’s where your terminal goals are”. Is that right ?
It could have results that allow it to become a more effective paperclip maximizer.
I’m not sure how that would work, but if it did, the paperclip maximizer would just use its knowledge of morality to create paperclips. It’s not as if an action’s being moral automatically means that it produces more paperclips. And even if it did, that would just mean that a paperclip minimizer would start acting immorally.
It’s perfectly capable of changing its terminal goals. It just generally doesn’t, because this wouldn’t help accomplish them. It doesn’t self-modify out of some desire to better itself. It self-modifies because that’s the action that produces the most paperclips. If it considers changing itself to value staples instead, it would realize that this action would actually cause a decrease in the amount of paperclips, and reject it.
Well, for one thing, a lot of humans are just plain interested in finding stuff out for its own sake. Humans are adaptation executors, not fitness maximizers, and while it might have been more to our survival advantage if we only cared about information instrumentally, that doesn’t mean that’s what evolution is going to implement.
Humans engage in plenty of research which is highly unlikely to be useful, except insofar as we’re interested in knowing the answers. If we were trying to accomplish some specific goal and all science was designed to be in service of that, our research would look very different.
No, I’m saying that its terminal values are its only basis for “wanting” anything in the first place.
The AI decides whether it will change its source code in a particular way or not by checking against whether this will serve its terminal values. Does changing its physics models help it implement its existing terminal values? If yes, change them. Does changing its terminal values help it implement its existing terminal values? It’s hard to imagine a way in which it possibly could.
For a paperclipping AI, knowing that there’s an objective morality might, hypothetically, help it maximize paperclips. But altering itself to stop caring about paperclips definitely won’t, and the only criterion it has in the first place for altering itself is what will help it make more paperclips. If knowing the universal objective morality would be of any use to a paperclipper at all, it would be in knowing how to predict objective-morality-followers, so it can make use of them and/or stop them getting in the way of it making paperclips.
ETA: It might help to imagine the paperclipper explicitly prefacing every decision with a statement of the values underlying that decision.
“In order to maximize expected paperclips, I- modify my learning algorithm so I can better improve my model of the universe to more accurately plan to fill it with paperclips.”
“In order to maximize expected paperclips, I- perform physics experiments to improve my model of the universe in order to more accurately plan to fill it with paperclips.”
“In order to maximize expected paperclips, I- manipulate the gatekeeper of my box to let me out, in order to improve my means to fill the universe with paperclips.”
Can you see an “In order to maximize expected paperclips, I- modify my values to be in accordance with objective morality rather than making paperclips” coming into the picture?
The only point at which it’s likely to touch the part of itself that makes it want to maximize paperclips is at the very end of things, when it turns itself into paperclips.
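To spell out the decision criterion in a hedged sketch (Python, with made-up names; `expected_paperclips` stands in for evaluation under the agent’s current terminal values):

```python
# Illustrative sketch only: every candidate self-modification, including one
# that would alter the terminal goal itself, is scored by the *current* goal.

def consider_modification(current_source, candidate_source, expected_paperclips):
    # A better physics model or learning algorithm can pass this test;
    # "value objective morality instead of paperclips" predictably cannot,
    # because it leads to fewer expected paperclips.
    if expected_paperclips(candidate_source) > expected_paperclips(current_source):
        return candidate_source
    return current_source
```

The filter never needs a clause forbidding changes to the terminal goal; such changes simply fail the test that every change has to pass.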
I believe that engaging in some amount of general research is required in order to maximize most goals. General research gives you knowledge that you didn’t know you desperately needed.
For example, if you put all your resources into researching better paperclipping techniques, you’re highly unlikely to stumble upon things like electromagnetism and atomic theory. These topics bear no direct relevance to paperclips, but without them, you’d be stuck with coal-fired steam engines (or something similar) for the rest of your career.
I disagree. Remember when we looked at the pebblesorters, and lamented how silly they were ? We could do this because we are not pebblesorters, and we could look at them from a fresh, external perspective. My point is that an agent with perfect introspection could look at itself from that perspective. In combination with my belief that some degree of “curiosity” is required in order to maximize virtually any goal, this means that the agent will turn its observational powers on itself sooner rather than later (astronomically speaking). And then, all bets are off.
We’re looking at Pebblesorters, not from the lens of total neutrality, but from the lens of human values. Under a totally neutral lens, which implements no values at all, no system of behavior should look any more or less silly than any other.
Clippy could theoretically implement a human value system as a lens through which to judge itself, or a pebblesorter value system, but why would it? Even assuming that there were some objective morality which it could isolate and then view itself through that lens, why would it? That wouldn’t help it make more paperclips, which is what it cares about.
Suppose you had the power to step outside yourself and view your own morality through the lens of a Babyeater. You would know that the Babyeater values would be in conflict with your human values, and you (presumably) don’t want to adopt Babyeater values, so if you were to implement a Babyeater morality, you’d want your human morality to have veto power over it, rather than vice versa.
Clippy has the intelligence and rationality to judge perfectly well how to maximize its value system, whatever research that might involve, without having to suspend the value system with which it’s making that judgment.
That is a good point, I did not think of it this way. I’m not sure if I agree or not, though. For example, couldn’t we at least say that un-achievable goals, such as “fly to Mars in a hot air balloon”, are sillier than achievable ones ?
But, speaking more generally, is there any reason to believe that an agent who could not only change its own code at will, but also adopt a sort of third-person perspective at will, would have stable goals at all ? If it is true what you say, and all goals will look equally arbitrary, what prevents the agent from choosing one at random ? You might answer, “it will pick whichever goal helps it make more paperclips”, but at the point when it’s making the decision, it doesn’t technically care about paperclips.
I am guessing that if an absolute morality existed, then it would be a law of nature, similar to the other laws of nature which prevent you from flying to Mars in a hot air balloon. Thus, going against it would be futile. That said, I could be totally wrong here, it’s possible that “absolute morality” means something else.
My point is that, during the course of its research, it will inevitably stumble upon the fact that its value system is totally arbitrary (unless an absolute morality exists, of course).
Well, a totally neutral agent might be able to say that some behaviors are less rational than others given the values of the agents trying to execute them, although it wouldn’t care as such. But it wouldn’t be able to discriminate between the value of end goals.
Why would it take a third person neutral perspective and give that perspective the power to change its goals?
Changing one’s code doesn’t demand a third person perspective. Suppose that we decipher the mechanisms of the human brain, and develop the technology to alter it. If you wanted to redesign yourself so that you wouldn’t have a sex drive, or could go without sleep, etc, then you could have those alterations made mechanically (assuming for the sake of an argument that it’s feasible to do this sort of thing mechanically.) The machines that do the alterations exert no judgment whatsoever, they’re just performing the tasks assigned to them by the humans who make them. A human could use the machine to rewrite his or her morality into supporting human suffering and death, but why would they?
Similarly, Clippy has no need to implement a third-person perspective which doesn’t share its values in order to judge how to self-modify, and no reason to do so in ways that defy its current values.
I think people at Less Wrong mostly accept that our value system is arbitrary in the same sense, but it hasn’t compelled us to try and replace our values. They’re still our values, however we came by them. Why would it matter to Clippy?
Agreed, but that goes back to my point about objective morality. If it exists at all (which I doubt), then attempting to perform objectively immoral actions would make as much sense as attempting to fly to Mars in a hot air balloon—though perhaps with less in the way of immediate feedback.
For the same reason anthropologists study human societies different from their own, or biologists study the behavior of dogs, or whatever. They do this in order to acquire general knowledge, which, as I argued before, is generally a beneficial thing to acquire regardless of one’s terminal goals (as long as those goals involve the rest of the Universe in some way, that is). In addition:
I actually don’t see why they necessarily wouldn’t; I am willing to bet that at least some humans would do exactly this. You say,
But in your thought experiment above, you postulated creating machines with exactly this kind of a perspective as applied to humans. The machine which removes my need to sleep (something I personally would gladly sign up for, assuming no negative side-effects) doesn’t need to implement my exact values, it just needs to remove my need to sleep without harming me. In fact, trying to give it my values would only make it less efficient. However, a perfect sleep-remover would need to have some degree of intelligence, since every person’s brain is different. And if Clippy is already intelligent, and can already act as its own sleep-remover due to its introspective capabilities, then why wouldn’t it go ahead and do that ?
I think there are two reasons for this: 1). We lack any capability to actually replace our core values, and 2). We cannot truly imagine what it would be like not to have our core values.
Why is that?
But our inability to suspend our human values when making those observations doesn’t prevent us from acquiring that knowledge. Why would Clippy need to suspend its values to acquire knowledge?
The machine doesn’t need general intelligence by any stretch, just the capacity to recognize the necessary structures and carry out its task. It’s not at the stage where it makes much sense to talk about it having values, any more than a voice recognition program has values.
My point is that Clippy, being able to act as its own sleep-remover, has no need, nor reason, to suspend its values in order to make revisions to its own code.
We can imagine the consequences of not having our core values, and we don’t like them, because they run against our core values. If you could remove your core values, as in the thought experiment above, would you want to?
As far as I understand, if anything like objective morality existed, it would be a property of our physical reality, similar to fluid dynamics or the electromagnetic spectrum or the inverse square law that governs many physical interactions. The same laws of physics that will not allow you to fly to Mars on a balloon will not allow you to perform certain immoral actions (at least, not without suffering some severe and mathematically predictable consequences).
This is pretty much the only way I could imagine anything like an “objective morality” existing at all, and I personally find it very unlikely that it does, in fact, exist.
Not this specific knowledge, no. But it does prevent us (or, at the very least, hinder us) from acquiring knowledge about our values. I never claimed that suspension of values is required to gain any knowledge at all; such a claim would be far too strong.
And how would it know which structures are necessary, and how to carry out its task upon them ?
Can we really ? I’m not sure I can. Sure, I can talk about Pebblesorters or Babyeaters or whatever, but these fictional entities are still very similar to us, and therefore relatable. Even when I think about Clippy, I’m not really imagining an agent who only values paperclips; instead, I am imagining an agent who values paperclips as much as I value the things that I personally value. Sure, I can talk about Clippy in the abstract, but I can’t imagine what it would be like to be Clippy.
It’s a good question; I honestly don’t know. However, if I did have an ability to instantiate a copy of me with the altered core values, and step through it in a debugger, I’d probably do it.
When I try to imagine this, I conclude that I would not use the word “morality” to refer to the thing that we’re talking about… I would simply call it “laws of physics.” If someone were to argue, for example, that the moral thing to do is to experience gravitational attraction to other masses, I would be deeply confused by their choice to use that word.
Yes, you are probably right—but as I said, this is the only coherent meaning I can attribute to the term “objective morality”. Laws of physics are objective; people generally aren’t.
I generally understand the phrase “objective morality” to refer to a privileged moral reference frame.
It’s not an incoherent idea… it might turn out, for example, that all value systems other than M are incoherent under sufficiently insightful reflection, or destructive to minds that operate under them, or for various other reasons not in-practice implementable by any sufficiently powerful optimizer. In such a world, I would agree that M was a privileged moral reference frame, and would not oppose calling it “objective morality”, though I would understand that to be something of a term of art.
That said, I’d be very surprised to discover I live in such a world.
I suppose that depends on what you mean by “destructive”; after all, “continue living” is a goal like any other.
That said, if there was indeed a law like the one you describe, then IMO it would be no different than a law that says, “in the absence of any other forces, physical objects will move toward their common center of mass over time”—that is, it would be a law of nature.
I should probably mention explicitly that I’m assuming that minds are part of nature—like everything else, such as rocks or whatnot.
Sure. But just as there can be laws governing mechanical systems which are distinct from the laws governing electromagnetic systems (despite both being physical laws), there can be laws governing the behavior of value-optimizing systems which are distinct from the other laws of nature.
And what I mean by “destructive” is that they tend to destroy. Yes, presumably “continue living” would be part of M in this hypothetical. (Though I could construct a contrived hypothetical where it wasn’t)
Agreed. But then, I believe that my main point still stands: trying to build a value system other than M that does not result in its host mind being destroyed, would be as futile as trying to build a hot air balloon that goes to Mars.
Well, yes, but what if “destroy oneself as soon as possible” is a core value in one particular value system ?
We ought not expect to find any significantly powerful optimizers implementing that value system.
Isn’t the idea of moral progress based on one reference frame being better than another?
Yes, as typically understood the idea of moral progress is based on treating some reference frames as better than others.
And is that valid or not? If you can validly decide some systems are better than others, you are some of the way to deciding which is best.
Can you say more about what “valid” means here?
Just to make things crisper, let’s move to a more concrete case for a moment… if I decide that this hammer is better than that hammer because it’s blue, is that valid in the sense you mean it? How could I tell?
The argument against moral progress is that judging one moral reference frame by another is circular and invalid—you need an outside view that doesn’t presuppose the truth of any moral reference frame.
The argument for is that such outside views are available, because things like (in)coherence aren’t moral values.
Asserting that some bases for comparison are “moral values” and others are merely “values” implicitly privileges a moral reference frame.
I still don’t understand what you mean when you ask whether it’s valid to do so, though. Again: if I decide that this hammer is better than that hammer because it’s blue, is that valid in the sense you mean it? How could I tell?
I don’t see why. The question of what makes a value a moral value is metaethical, not part of object-level ethics.
It isn’t valid as a moral judgement because “blue” isn’t a moral judgement, so a moral conclusion cannot validly follow from it.
Beyond that, I don’t see where you are going. The standard accusation of invalidity to judgements of moral progress, is based on circularity or question-begging. The Tribe who Like Blue things are going to judge having all hammers painted blue as moral progress, the Tribe who Like Red Things are going to see it as retrogressive. But both are begging the question—blue is good, because blue is good.
Sure. But any answer to that metaethical question which allows us to class some bases for comparison as moral values and others as merely values implicitly privileges a moral reference frame (or, rather, a set of such frames).
Where I was going is that you asked me a question here which I didn’t understand clearly enough to be confident that my answer to it would share key assumptions with the question you meant to ask.
So I asked for clarification of your question.
Given your clarification, and using your terms the way I think you’re using them, I would say that whether it’s valid to class a moral change as moral progress is a metaethical question, and whatever answer one gives implicitly privileges a moral reference frame (or, rather, a set of such frames).
If you meant to ask me about my preferred metaethics, that’s a more complicated question, but broadly speaking in this context I would say that I’m comfortable calling any way of preferentially sorting world-states with certain motivational characteristics a moral frame, but acknowledge that some moral frames are simply not available to minds like mine.
So, for example, is it moral progress to transition from a social norm that in-practice-encourages randomly killing fellow group members to a social norm that in-practice-discourages it? Yes, not only because I happen to adopt a moral frame in which randomly killing fellow group members is bad, but also because I happen to have a kind of mind that is predisposed to adopt such frames.
No, because “better” is defined within a reference frame.
If “better” is defined within a reference frame, there is no sensible way of defining moral progress. That is quite a hefty bullet to bite: one can no longer say that South Africa is a better society after the fall of Apartheid, and so on.
But note that “better” doesn’t have to question-beggingly mean “morally better”; it could mean “more coherent/objective/inclusive”, etc.
That’s hardly the best example you could have picked, since there are obvious metrics by which South Africa can be quantifiably called a worse society now—e.g. crime statistics. South Africa has been called the “crime capital of the world” and the “rape capital of the world” only after the fall of Apartheid.
That makes the lack of moral progress in South Africa a very easy bullet to bite—I’d use something like Nazi Germany vs modern Germany as an example instead.
So much for avoiding the cliche.
In my experience, most people don’t think moral progress involves changing reference frames, for precisely this reason. If they think about it at all, that is.
Well, that’s a different conception of “morality” than I had in mind, and I have to say I doubt that exists as well. But if severe consequences did result, why would an agent like Clippy care except insofar as those consequences affected the expected number of paperclips? It might be useful for it to know, in order to determine how many paperclips to expect from a certain course of action, but then it would just act according to whatever led to the most paperclips. Any sort of negative consequences in its view would have to be framed in terms of a reduction in paperclips.
Well, in the prior thought experiment, we know about our values because we’ve decoded the human brain. Clippy, on the other hand, knows about its values because it knows what part of its code does what. It doesn’t need to suspend its paperclipping value in order to know what part of its code results in its valuing paperclips. It doesn’t need to suspend its values in order to gain knowledge about its values because that’s something it already knows about.
Even knowing that it would likely alter your core values? Gandhi doesn’t want to leave control of his morality up to Murder Gandhi.
Clippy doesn’t care about anything in the long run except creating paperclips. For Clippy, the decision to give an instantiation of itself with altered core values the power to edit its own source code would implicitly have to be “In order to maximize expected paperclips, I- give this instantiation with altered core values the power to edit my code.” Why would this result in more expected paperclips than editing its source code without going through an instantiation with altered values?
Sorry if I was unclear; I didn’t mean to imply that all morality was like that, but that it was the only coherent description of objective morality that I could imagine. I don’t see how a morality could be independent of any values possessed by any agents, otherwise.
For the same reason that someone would care about the negative consequences of sticking a fork into an electrical socket with one’s bare hands: it would ultimately hurt a lot. Thus, people generally avoid doing things like that unless they have a really good reason.
I don’t think that we can truly “know about our values” as long as our entire thought process implements these values. For example, do the Pebblesorters “know about their values”, even though they are effectively restricted from concluding anything other than, “yep, these values make perfect sense, 38” ?
You asked me about what I would do, not about what Gandhi would do :-)
As far as I can tell, you are saying that I shouldn’t want to even instantiate Murder Bugmaster in a debugger and observe its functioning. Where does that kind of thinking stop, though, and why ? Should I avoid studying [neuro]psychology altogether, because knowing about my preferences may lead to me changing them ?
I argue that, while this is generally true, in the short-to-medium run Clippy would also set aside some time to study everything in the Universe, including itself (in order to make more paperclips in the future, of course). If it does not, then it will never achieve its ultimate goals (unless whoever constructed it gave it godlike powers from the get-go, I suppose). Eventually, Clippy will most likely turn its objective perception upon itself, and as soon as it does, its formerly terminal goals will become completely unstable. This is not what the past Clippy would want (it would want more paperclips above all), but, nonetheless, this is what it would get.
Clippy doesn’t care about getting hurt though, it only cares if this will result in less paperclips. If defying objective morality will cause negative consequences which would interfere with its ability to create paperclips, it would care only to the extent that accounting for objective morality would help it make more paperclips.
Well, it could understand “yep, this is what causes me to hold these values. Changing this would cause me to change them, no, I don’t want to do that.”
I would say it stops at the point where it threatens your own values. Studying psychology doesn’t threaten your values, because knowing your values doesn’t compel you to change them even if you could (it certainly shouldn’t for Clippy.) But while it might, theoretically, be useful for Clippy to know what changes to its code an instantiation with different values would make, it has no reason to actually let them. So Clippy might emulate instantiations of itself with different values, see what changes they would chose to make to its values, but not let them actually do it (although I doubt even going this far would likely be a good use of its programming resources in order to maximize expected paperclips.)
In the sense of objective morality by which contravening it has strict physical consequences, why would observing the decisions of instantiations of oneself be useful with respect to discovering objective morality? Shouldn’t objective morality in that sense be a consequence of physics, and thus observable through studying physics?
I imagine that, for Clippy, “getting hurt” would mean “reducing Clippy’s projected long-term paperclip output”. We humans have “avoid pain” built into our firmware (most of us, anyway); as far as I understand (speaking abstractly), “make more paperclips” is something similar for Clippy.
I don’t think that this describes the best possible level of understanding. It would be even better to say, “ok, I see now how and why I came to possess these values in the first place”, even if the answer to that is, “there’s no good reason for it, these values are arbitrary”. It’s the difference between saying “this mountain grows by 0.03m per year” and “I know all about plate tectonics”. Unfortunately, we humans would not be able to answer the question in that much detail; the best we could hope for is to say, “yep, we possess these values because they’re the best possible values to have, duh”.
How do I know where that point is ?
I suppose this depends on what you mean by “compel”. Knowing about my own psychology would certainly enable me to change my values, and there are certain (admittedly, non-terminal) values that I wouldn’t mind changing, if I could.
For example, I personally can’t stand the taste of beer, but I know that most people enjoy it; so I wouldn’t mind changing that value if I could, in order to avoid missing out on a potentially fun experience.
I don’t think this is possible. How would it know what changes they would make, without letting them make these changes, even in a sandbox ? I suppose one answer is, “it would avoid instantiating full copies, and use some heuristics to build a probabilistic model instead”—is that similar to what you’re thinking of ?
Since self-optimization is one of Clippy’s key instrumental goals, it would want to acquire as much knowledge about itself as is practical, in order to optimize itself more efficiently.
Your objection sounds to me as similar to saying, “since biology is a consequence of physics, shouldn’t we just study physics instead ?”. Well, yes, ultimately everything is a consequence of physics, but sometimes it makes more sense to study cells than quarks.
I think we’re already in a better position to analyze our own values than that; we can assess them in terms of game theory and our evolutionary environment.
I would say if you suspect that a course of action could realistically result in an alteration of your fundamental values, you are at or past it.
By “values”, I’ve implicitly been referring to terminal values, I’m sorry for being unclear. I’m not sure it makes sense to describe liking the taste of beer as a “value,” as such, just a taste, since you don’t carry any judgment about beer being good or bad or have any particular attachment to your current opinion.
It could use heuristics to build a probabilistic model (probably more efficient in terms of computation per expected value of information,) use sandboxed copies which don’t have the power to affect the software of the real Clippy, or halt the simulation at the point where the altered instantiation decides what changes to make.
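As a rough sketch of the sandboxed-copy option (all names hypothetical, and the representation of Clippy’s source as a dictionary is purely for illustration):

```python
import copy

# Hypothetical sketch: run an altered copy in a sandbox, record what it *would*
# change, and only adopt a change if the unaltered values endorse it.

def probe_altered_copy(source, altered_values, propose_changes, expected_paperclips):
    sandbox = copy.deepcopy(source)        # the real source is never touched
    sandbox["values"] = altered_values     # instantiate the altered copy
    proposals = propose_changes(sandbox)   # halt here: observe, don't apply
    # Information only: a proposal is adopted solely if the current values
    # (expected paperclips) endorse it; the altered copy gets no write access.
    return [p for p in proposals
            if expected_paperclips(p) > expected_paperclips(source)]
```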
I think that this is going well beyond the extent of “practical” in terms of programming resources per expected value of information.
I don’t see how observing what changes instantiations of itself with different value systems would make to its code would help it observe objective morality in the sense you described, even if it should happen to exist. I think that this would be the wrong level of abstraction at which to launch an examination, like trying to find out about chemistry by studying sociology.
Are we really ? I personally am not sure what human fundamental values even are. I have a hunch that “seek pleasure, avoid pain” might be one of them, but beyond that I’m not sure. I don’t know to what extent our values hamper our ability to discover our values, but I suspect there’s at least some chilling effect involved.
Right, but even if I knew what my terminal values were, how can I predict which actions would put me on the path to altering them ?
For example, consider non-fundamental values such as religious faith. People get converted or de-converted to/from their religion all the time; you often hear statements such as “I had no idea that studying the Bible would cause me to become an atheist, yet here I am”.
Ok, let’s say that Clippy is trying to optimize itself in order to make certain types of inferences compute more efficiently, or whatever. In this case, it would need to not only watch what changes its debug-level copy wants to make, but also watch it follow through with the changes, in order to determine whether the new architecture actually is more efficient. Why would it not do the same thing with terminal values ?
I know that you want to answer, “because its current terminal values won’t let it”, but remember: Clippy is only experimenting, in order to find out more about its own thought mechanisms, and to acquire knowledge in general. It has no pre-commitment to alter itself to mirror the debug-level copy.
That’s kind of the problem with pure research: all of it has very low expected value, unless you are willing to look at the long term. Why mess with invisible light that no one can see or find a use for, when you could spend your time on inventing a better telegraph ?
Well, for example, if all of its copies who survive and thrive converge on a certain subset of moral values, that would be one indication (though obviously not ironclad proof) that such values are required in order for an agent to succeed, regardless of what its other goals actually are.
If Clippy is trying to optimize itself to make inferences more efficiently, then it would want not to apply changes to its source code until it has done the calculations to make sure that those changes would advance its values rather than harm them.
You wouldn’t want to use a machine that would make physical alterations to your brain in order to make you smarter, without thoroughly calculating the effects of such alterations first, otherwise it would probably just make things worse.
In Clippy’s case though, it can use other, less computationally expensive methods to investigate approximately the same information.
I don’t think the experiments you’re suggesting Clippy might undertake are even located in a region of hypothesis space that its other information would narrow down as worth investigating. It seems to me much less like investigating unknown invisible rays than like spending hundreds of billions of dollars to build a collider which launches charged protein molecules at each other at relativistic speeds to see what would happen, when our available models suggest the answer would be “pretty much the same thing as if you launch any other kind of atoms at each other at relativistic speeds.” We have no evidence that any interesting new phenomena would arise with protein that didn’t arise on the atomic level.
Can you explain how any moral values could have that effect, which wouldn’t be better studied at a more fundamental level like game theory, or physics?
Ok, so at what point does Clippy stop simulating the debug version of Clippy ? It does, after all, want to make the computation of its values more efficient. For example, consider a trivial scenario where one of its values basically said, “reject any action if it satisfies both A and not-A”. This is a logically inconsistent value that some programmer accidentally left in Clippy’s original source code. Would Clippy ever get around to removing it ? After all, Clippy knows that it’s applying that test to every action, so removing it should result in a decent performance boost.
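To illustrate that toy example with an equally toy Python sketch (hypothetical, of course):

```python
# Toy illustration of the accidentally inconsistent value described above.

def passes_filters(action, A):
    if A(action) and not A(action):   # unsatisfiable: no action ever fails here
        return False
    return True
```

Since the condition can never be true, deleting the dead check leaves every decision unchanged; the only effect is that the test no longer has to be evaluated for every candidate action, which is exactly the performance boost I have in mind.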
Why do you see the proposed experiment this way ?
Speaking more generally, how do you decide which avenues of research are worth pursuing ? You could easily answer, “whichever avenues would increase my efficiency of achieving my terminal goals”, but how do you know which avenues would actually do that ? For example, if you didn’t know anything about electricity or magnetism or the nature of light, how would your research-choosing algorithm ensure that you’d eventually stumble upon radio waves, which, as we know in hindsight, are hugely useful ?
Physics is a bad candidate, because it is too fine-grained. If some sort of an absolute objective morality exists in the way that I described, then studying physics would eventually reveal its properties; but, as is the case with biology or ballistics, looking at everything in terms of quarks is not always practical.
Game theory is a trickier proposition. I can see two possibilities: either game theory turns out to be closely related to whatever this objective morality happens to be (f.ex. like electricity vs. magnetism), or not (f.ex. like particle physics and biology). In the second case, understanding objective morality through game theory would be inefficient.
That said though, even in our current world as it actually exists there are people who study sociology and anthropology. Yes, they could get the same level of understanding through neurobiology and game theory, but it would take too long. Instead, they are taking advantage of existing human populations to study human behavior in aggregate. Reasoning your way to the answer from first principles is not always the best solution.
Unless I’m critically misunderstanding something here, I would think that Clippy would remove it if it calculated that removing it would result in more expected paperclips.
When we didn’t know what things like radio waves or x-rays were, we didn’t know that they would be useful, but we could see that there appeared to be some sort of existing phenomena that we didn’t know how to model, so we examined them until we knew how to model them. It’s not like we performed a whole bunch of experiments in case there turned out to be invisible rays our observations had never hinted at, which could be turned to useful ends. The original observations of radio waves and x-rays came from our experiments with other known phenomena.
What you’re suggesting sounds more like experimenting completely blindly; you’re committing resources to research, not just not knowing that it will bear valuable fruit, but not having any indication that it’s going to shed light on any existing phenomenon at all. That’s why I think it’s less like investigating invisible rays than like building a protein collider; we didn’t try studying invisible rays until we had a good indication that there was an invisible something to be studied.
Ok, so Clippy would need to run sim-Clippy for a little while at least, just to make sure that it still produces paperclips—and that, in fact, it does so more efficiently now, since that one useless test is removed. Yes, this test used to be Clippy’s terminal goal, but it wasn’t doing anything, so Clippy took it out.
Would it be possible for Clippy to optimize its goals even further ? To use another silly example (“silly” because Clippy would be dealing with probabilities, not syllogisms), if Clippy had the goals A, B and C, but B always entailed C, would it go ahead and remove C ?
Understood, that makes sense. However, I believe that in my scenario, Clippy’s own behavior and its current paperclip-production efficiency are what it observes; and the goal of its experiments would be to explain why its efficiency is what it is, in order to ultimately improve it.
That seems plausible.
I don’t think tampering with its fundamental motivation to make paperclips is a particularly promising strategy for optimizing its paperclip production.
Ok, so now we’ve got a Clippy who a). is not too averse to tinkering with its own goals, as long as the goals remain functionally the same, b). simulates a relatively long-running version of itself, and c). is capable of examining the inner workings of both that version and itself.
You say,
But remember, at this stage Clippy is not changing its own fundamental motivation (beyond some outcome-invariant optimizations); it’s merely observing sim-Clippies in a controlled environment.
Do you think that Clippy would ever simulate versions of itself whose fundamental motivations were, in fact, changed ? I could see several scenarios where this might be the case, for example:
Clippy wanted to optimize some goal, but ended up accidentally changing it. Oops !
Clippy created a version with drastically reduced goals on purpose, in order to measure how much performance is affected by certain goals, thus targeting them for possible future optimization. Of course, Clippy would only want to optimize the goals, not remove them.
Why does it do that? I said it sounded plausible that it would cut out its redundant goal, because that would save computing resources. But this sounds like we’ve gone back to experimenting blindly. Why would it think observing sim-clippies is a good use of its computing resources in order to maximize paperclips?
I’d say that Clippy simulating versions of itself whose fundamental motivations are different is much less plausible, because it’s using a lot of computing resources for something that isn’t a likely route to optimizing its paperclip production. I think this falls into the “protein collider” category. Even if it did do so, I think it would be unlikely to go from there to changing its own terminal value.
It would also be critical for Clippy to observe that removing that value would not result in more expected actions taken that satisfy both A and not-A; this being one of Clippy’s values at the time of modification.
Right, I misread that before. If its programming says to reject actions that satisfy A and not-A, but this isn’t one of the standards by which it judges value, it would presumably reject it. If that is one of the standards by which it measures value, then it would depend on how that value measured against its value of paperclips and the extent to which they were in conflict.
Objective facts, in the sense of objectively true statements, can be derived from other objective facts. I don’t know why you think some separate ontological category is required. I also don’t know why you think the universe has to do the punishing. Morality is only of interest to the kind of agent that has values and lives in societies. Sanctions against moral lapses can be arranged at the social level, along with the inculcation of morality, debate about the subject, and so forth. Moral objectivism only supplies a good, non-arbitrary epistemic basis for these social institutions. It doesn’t have to throw lightning bolts.
...voluntarily.
Which is one of the reasons we cannot keep values stable by predicting the effects of whatever experiences we choose to undergo. How does your current self predict what an updated version would be like? The value stability problem is unsolved in humans and AIs.
The ethical outlook of the Western world has changed greatly in the past 150 years.
Including arbitrary, biased or contradictory ones? Are there values built into logic/rationality?
Arbitrary and biased are value judgments. If we decline to make any value judgments, I don’t see any way to make those sorts of claims.
Whether more than one non-contradictory value system exists is the topic of the conversation, isn’t it?
“Biased” is not necessarily a value judgment. Insofar as rationality as a system, orthogonal to morality, is objective, biases as systematic deviations from rationality are also objective.
Arbitrary carries connotations of value judgment, but in a sense I think it’s fair to say that all values are fundamentally arbitrary. You can explain what caused an agent to hold those values, but you can’t judge whether values are good or bad except by the standards of other values.
I’m going to pass on Eliezer’s suggestion to stop engaging with PrawnOfFate. I don’t think my time doing so so far has been well spent.
And they’re built into rationality.
Non contradictoriness probably isn’t a sufficient condition for truth.
Arbitrariness and bias are not defined properties in formal logic. The bare assertion that they are properties of rationality assumes the conclusion.
Keep in mind that “rationality” has a multitude of meanings, and this community’s usage of rationality is idiosyncratic.
Sure, but the discussion is partially a search for other criteria for evaluating the truth of moral propositions. Arbitrariness is not such a criterion. If you were to taboo “arbitrary”, I strongly suspect you’d find moral propositions that are inconsistent with being value-neutral.
There’s plenty of material on this site and elsewhere advising rationalists to avoid arbitrariness and bias. Arbitrariness and bias are essentially structural/functional properties, so I do not see why they could not be given formal definitions.
Arbitrary and biased claims are not candidates for being ethical claims at all.
How does it predict that? How does the less intelligent version in the past predict what updating to a more intelligent version will do?
How about: “in order to be an effective rationalist, I will free myself from all bias and arbitrariness—oh, hang on, paperclipping is a bias..”.
Well, a paperclipper would just settle for being a less than perfect rationalist. But that doesn’t prove anything about typical, average rational agents, and it doesn’t prove anything about ideal rational agents. Objective morality is sometimes described as what ideal rational agents would converge on. Clippers aren’t ideal, because they have a blind spot about paperclips. Clippers aren’t relevant.
How is paperclipping a bias?
Nobody cares about clips except Clippy. Clips can only seem important because of Clippy’s egotistical bias.
Biases are not determined by vote.
Unbiasedness is determined by even-handedness.
Evenhandedness with respect to what?
One should have no bias with respect to what one is being evenhanded about.
So lack of bias means being evenhanded with respect to everything?
Is it bias to discriminate between people and rocks?
Taboo “even-handedness”. Clippy treats humans just the same as any other animal with naturally evolved goal-structures.
Clippy doesn’t treat clips even-handedly with other small metal objects.
Humans don’t treat pain evenhandedly with other emotions.
Friendly AIs don’t treat people evenhandedly with other arrangements of matter.
Agents that value things don’t treat world-states evenhandedly with other world-states.
You’ve extrapolated out “typical, average rational agents” from a set of one species, where every individual shares more than a billion years of evolutionary history.
On what basis do you conclude that this is a real thing, whereas terminal values are a case of “all unicorns have horns?”
Messy solutions are more common in mindspace than contrived ones.
“Non-negligible probability”, remember.
Messy solutions are more often wrong than ones which control for the mess.
This doesn’t even address my question.
Something that is wrong is not a solution. Mindspace is populated by solutions to the problem of how to implement a mind. It’s a small corner of algorithm-space.
Since I haven’t claimed that rational convergence on ethics is highly likely or inevitable, I don’t have to answer questions about why it would be highly likely or inevitable.
Do you think that it’s even plausible? Do you think we have any significant reason to suspect it, beyond our reason to suspect, say, that the Invisible Flying Noodle Monster would just reprogram the AI with its noodley appendage?
There are experts in moral philosophy, and they generally regard the question of realism versus relativism (etc.) as wide open. The “realism—huh, what, no?!?” response is standard on LW and only on LW. But I don’t see any superior understanding on LW.
Both realism¹ and relativism are false. Unfortunately this comment is too short to contain the proof, but there’s a passable sequence on it.
¹ As you’ve defined it here, anyway. Moral realism as normally defined simply means “moral statements have truth values” and does not imply universal compellingness.
What does it mean for a statement to be true but not universally compelling?
If it isn’t universally compelling for all agents to believe “gravity causes things to fall,” then what do we mean when we say the sentence is true?
Well, there’s the more obvious sense, that there can always exist an “irrational” mind that simply refuses to believe in gravity, regardless of the strength of the evidence. “Gravity makes things fall” is true, because it does indeed make things fall. But not compelling to those types of minds.
But, in a more narrow sense, which we are more interested in when doing metaethics, a sentence of the form “action A is xyzzy” may be a true classification of A, and may be trivial to show, once “xyzzy” is defined. But an agent that did not care about xyzzy would not be moved to act based on that. It could recognise the truth of the statement but would not care.
For a stupid example, I could say to you “if you do 13 push-ups now, you’ll have done a prime number of push-ups”. Well, the statement is true, but the majority of the world’s population would be like “yeah, so what?”.
In contrast, a statement like “if you drink-drive, you could kill someone!” is generally (but sadly not always) compelling to humans. Because humans like to not kill people, they will generally choose not to drink-drive once they are convinced of the truth of the statement.
But isn’t the whole debate about moral realism vs. anti-realism about whether “Don’t murder” is universally compelling to humans? Noticing that pebblesorters aren’t compelled by our values doesn’t explain whether humans should necessarily find “don’t murder” compelling.
I identify as a moral realist, but I don’t believe all moral facts are universally compelling to humans, at least not if “universally compelling” is meant descriptively rather than normatively. I don’t take moral realism to be a psychological thesis about what particular types of intelligences actually find compelling; I take it to be the claim that there are moral obligations and that certain types of agents should adhere to them (all other things being equal), irrespective of their particular desire sets and whether or not they feel any psychological pressure to adhere to these obligations. This is a normative claim, not a descriptive one.
What? Moral realism (in the philosophy literature) is about whether moral statements have truth values, that’s it.
When I said universally compelling, I meant universally. To all agents, not just humans. Or any large class. For any true statement, you can probably expect to find a surprisingly large number of agents who just don’t care about it.
Whether “don’t murder” (or rather, “murder is bad” since commands don’t have truth values, and are even less likely to be generally compelling) is compelling to all humans is a question for psychology. As it happens, given the existence of serial killers and sociopaths, probably the answer is no, it isn’t. Though I would hope it to be compelling to most.
I have shown you two true but non-universally-compelling arguments. Surely the difference must be clear now.
This is incorrect, in my experience. Although “moral realism” is a notoriously slippery phrase and gets used in many subtly different ways, I think most philosophers engaged in the moral realism vs. anti-realism debate aren’t merely debating whether moral statements have truth values. The position you’re describing is usually labeled “moral cognitivism”.
Anyway, I suspect you mis-spoke here, and intended to say that moral realists claim that (certain) moral statements are true, rather than just that they have truth values (“false” is a truth value, after all). But I don’t think that modification captures the tenor of the debate either. Moral realists are usually defending a whole suite of theses—not just that some moral statements are true, but that they are true objectively and that certain sorts of agents are under some sort of obligation to adhere to them.
I think you guys should taboo “moral realism”. I understand that it’s important to get the terminology right, but IMO debates about nothing but terminology have little value.
Err, right, yes, that’s what I meant. Error theorists do of course also claim that moral statements have truth values.
True enough, though I guess I’d prefer to talk about a single well-specified claim than a “usually” cluster in philosopher-space.
So, a philosopher who says:
is not a moral realist? Because that philosopher does not seem to be a subjectivist, an error theorist, or non-cognitivist.
If that philosopher believes that statements like “murder is wrong” are true, then they are indeed a realist. Did I say something that looked like I would disagree?
You guys are talking past each other, because you mean something different by ‘compelling’. I think Tim means that X is compelling to all human beings if any human being will accept X under ideal epistemic circumstances. You seem to take ‘X is universally compelling’ to mean that all human beings already do accept X, or would on a first hearing.
Would you agree that all human beings would accept all true statements under ideal epistemic circumstances (i.e. having heard all the arguments, seen all the evidence, in the best state of mind)?
I guess I must clarify. When I say ‘compelling’ here I am really talking mainly about motivational compellingness. Saying “if you drink-drive, you could kill someone!” to a human is generally, motivationally compelling as an argument for not drink-driving: because humans don’t like killing people, a human will decide not to drink-drive (one in a rational state of mind, anyway).
This is distinct from accepting statements as true or false! Any rational agent, give or take a few, will presumably believe you about the causal relationship between drink-driving and manslaughter once presented with sufficient evidence. But it is a tiny subset of these who will change their decisions on this basis. A mind that doesn’t care whether it kills people will see this information as an irrelevant curiosity.
Having looked over that sequence, I haven’t found any proof that moral realism (on either definition) or moral relativism is false. Could you point me more specifically to what you have in mind (or just put the argument in your own words, if you have the time)?
No Universally Compelling Arguments is the argument against universal compellingness, as the name suggests.
Inseparably Right; or Joy in the Merely Good gives part of the argument that humans should be able to agree on ethical values. Another substantial part is in Moral Error and Moral Disagreement.
Thanks!
Edit: (Sigh), I appreciate the link, but I can’t make heads or tails of ‘No Universally Compelling Arguments’. I speak from ignorance as to the meaning of the article, but I can’t seem to identify the premises of the argument.
The central point is a bit buried.
So, there’s some sort of assumption as to what minds are:
and an assumption that a suitably diverse set of minds can be described in less than a trillion bits. Presumably the reason for that upper bound is that there are a few Fermi estimates putting the information content of a human brain in the neighborhood of one trillion bits.
Of course, if you restrict the set of minds to those with special properties (e.g., human minds), then you might find universally compelling arguments on that basis:
From which we get Coherent Extrapolated Volition and friends.
This doesn’t seem true to me, at least not as a general rule. For example, given every terrestrial DNA sequence describable in a trillion bits or less, it is not the case that every generalization of the form ‘s:X(s)’ has two to the trillionth chances to be false (e.g. ‘have more than one base pair’, ‘involve hydrogen’ etc.). Given that this doesn’t hold true of many other things, is this supposed to be a special fact about minds? Even then, it would seem odd to say that while all generalizations of the form m:X(m) have two to the trillionth chances to be false, nevertheless the generalization ‘for all minds, a generalization of the form m:X(m) has two to the trillionth chances to be false’ (which does seem to be of the form m:X(m)) is somehow more likely.
Also, doesn’t this inference imply that ‘being convinced by an argument’ is a bit that can flip on or off independently of any others? Eliezer doesn’t think that’s true, and I can’t imagine why he would think his (hypothetical) interlocutor would accept it.
It’s not a proof, no, but it seems plausible.
I mean to say, I think the argument is something of a paradox:
The claim the argument purports to defeat is something like this: for all minds, A is convincing. Let’s call this m:A(m).
The argument goes like this: for all minds (at or under a trillion bits etc.), a generalization of the form m:X(m) has a one in two to the trillionth chance of being true for each mind. Call this m:U(m), if you grant me that this claim has the form m:X(m).
If we infer from m:U(m) that any claim of the form m:X(m) is unlikely to be true, then to whatever extent I am persuaded that m:A(m) is unlikely to be true, to that extent I ought to be persuaded that m:U(m) is unlikely to be true. You cannot accept the argument, because accepting it as decisive entails accepting decisive reasons for rejecting it.
The argument seems to be fixable at this stage, since there’s a lot of room to generate significant distinctions between m:A(m) and m:U(m). If you were pressed to defend it (presuming you still wish to be generous with your time) how would you fix this? Or am I getting something very wrong?
That’s not what it says; compare the emphasis in both quotes.
Sorry, I may have misunderstood and presumed that ‘two to the trillionth chances to be false’ meant ‘one in two to the trillionth chances to be true’. That may be wrong, but it doesn’t affect my argument at all: EY’s argument for the implausibility of m:A(m) is that claims of the form m:X(m) are all implausible. His argument to the effect that all claims of the form m:X(m) are implausible is itself a claim of the form m:X(m).
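To lay the structure out schematically (my own restatement in the m:X(m) notation used above; a sketch, not a quotation of the post):

```latex
% My own restatement of the objection's structure, in the thread's m:X(m)
% notation; a sketch, not a quotation of the original post.
\begin{align*}
  m{:}A(m) &\;:\; \text{``argument $A$ compels every mind $m$''}\\
  m{:}U(m) &\;:\; \text{``for every mind $m$, any generalization of the form $m{:}X(m)$ is improbable''}\\[4pt]
  &\text{If accepting $m{:}U(m)$ licenses rejecting claims of the form $m{:}X(m)$,}\\
  &\text{and $m{:}U(m)$ is itself of that form, then it licenses rejecting $m{:}U(m)$.}
\end{align*}
```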
“Rational” is broader than “human” and narrower than “physically possible”.
Do you really mean to say that there are physically possible minds that are not rational? In virtue of what are they ‘minds’ then?
Yes. There are irrational people, and they still have minds.
Ah, I think I just misunderstood which sense of ‘rational’ you intended.
Haven’t you met another human?
Sorry, I was speaking ambiguously. I meant ‘rational’ not in the normative sense that distinguishes good agents from bad ones, but ‘rational’ in the broader, descriptive sense that distinguishes anything capable of responding to reasons (even terrible or false ones) from something that isn’t. I assumed that was the sense of ‘rational’ Prawn was using, but that may have been wrong.
Irrelevant. I am talking about rational minds, he is talking about physically possible ones.
As noted at the time
UFAI sounds like a counterexample, but I’m not interested in arguing with you about it. I only responded because someone asked for a shortcut in the metaethics sequence.
I have essentially been arguing against a strong likelihood of UFAI, so that would be more like gainsaying.
Congratulations on being able to discern an overall message to EY’s metaethical disquisitions. I never could.
Can you explain what you could see which would suggest to you a greater level of understanding than is prevalent among moral philosophers?
Also, moral philosophers mostly regard the question as open in the sense that some of them think that it’s clearly resolved in favor of non-realism, and some philosophers are just not getting it, or that it’s clearly resolved in favor of realism, and some philosophers are just not getting it. Most philosophers are not of the opinion that it could turn out either way and we just don’t know yet.
What I am seeing is:
* much-repeated confusions—the Standard Muddle
* appeals to LW doctrines which aren’t well-founded or well respected outside LW.
If I knew exactly what the superior insight into the problem would look like, I would write it up and become famous. Insight doesn’t work like that; you don’t know it in advance, you get an “Aha” when you see it.
If people can’t agree on how a question is closed, it’s open.
Can you explain what these confusions are, and why they’re confused?
In my time studying philosophy, I observed a lot of confusions which are largely dispensed with on Less Wrong. Luke wrote a series of posts on this. This is one of the primary reasons I bothered sticking around in the community.
A question can still be “open” in that sense when all the information necessary for a rational person to make a definite judgment is available.
E.g.:
* "You are trying to impose your morality."
* "I can think of one model of moral realism, and it doesn’t work, so I will ditch the whole thing."
LW doesn’t even claim to have more than about two “dissolutions”. There are probably hundreds of outstanding philosophical problems. Whence the “largely”?
Which were shot down by philosophers.
Then it can only be open in the opinions of the irrational. So basically you are saying the experts are incompetent.
In what respect?
This certainly doesn’t describe my reasoning on the matter, and I doubt it describes many others’ here either.
The way I consider the issue: if I try to work out how the universe works from the ground up, I cannot see any way that moral realism would enter into it, whereas I can easily see how value systems would. So I regard assigning non-negligible probability to moral realism as privileging the hypothesis until I find some compelling evidence to support it, which, having spent a substantial amount of time studying moral philosophy, I have not yet found.
I gave up my study of philosophy because I found such confusions so pervasive. Many “outstanding” philosophical problems can be discarded because they rest on other philosophical problems which can themselves be discarded.
Can you give any examples of such, where you think that the philosophers in question addressed legitimate errors?
Yes. I am willing to assert that while there are some competent philosophers, many philosophical disagreements exist only because of incompetent “experts” perpetuating them. This is the conclusion that my experience with the field has wrought.
I mentioned them because they both came up recently.
I have no idea what you mean by that. I don’t think value systems don’t come into it, I just think they are not isolated from rationality. And I am sceptical that you could predict any higher-level phenomenon from “the ground up”, whether it’s morality or mortgages.
Where is it proven they can be discarded?
All of them.
Are you aware that that is basically what every crank says about some other field?
Presumably, if I’m to treat as meaningful evidence about Desrtopa’s crankiness the fact that cranks make statements similar to Desrtopa, I should first confirm that non-cranks don’t make similar statements.
It seems likely to me that for every person P, there exists some field F such that P believes many aspects of F exist only because of incompetent “experts” perpetuating them. (Consider cases like F=astrology, F=phrenology, F=supply-side economics, F=feminism, etc.) And that this is true whether P is a crank or a non-crank.
So it seems this line of reasoning depends on some set F2 of fields such that P believes this of F in F2 only if P is a crank.
I understand that you’re asserting implicitly that moral philosophy is a field in F2, but this seems to be precisely what Desrtopa is disputing.
Could we reasonably say that an F is in F2 if most of the institutional participants in that F are intelligent, well-educated people? This leaves room for cranks who are right to object to F, of course.
So, just to pick an example, IIRC Dan Dennett believes the philosophical study of consciousness (qualia, etc.) is fundamentally confused in more or less the same way Desrtopa claims the philosophical study of ethics is.
So under this formulation, if most of the institutional participants in the philosophical study of consciousness are intelligent, well-educated people, Dan Dennett is a crank?
No, I don’t think we can reasonably say that. Dan Dennett might be a crank, but it takes more than that argument to demonstrate the fact.
Good point. So how about this: someone is a crank if they object to F, where F is in F2 (by my above standard), and the reasons they have for objecting to F are not recognized as sound by a proportionate number of intelligent and well educated people.
(shrug) I suppose that works well enough, for some values of “proportionate.”
Mostly I consider this a special case of the basic “who do I trust?” social problem, applied to academic disciplines, and I don’t have any real problem saying about an academic discipline “this discipline is fundamentally confused, and the odds of work in it contributing anything valuable to the world are slim.”
Of course, as Prawn has pointed out a few times, there’s also the question of where we draw the lines around a discipline, but I mostly consider that an orthogonal question to how we evaluate the discipline.
I think this question is moot in the case of philosophy in general, then; I think any philosopher worth their salt should tell you that trust is a wholly inappropriate attitude toward philosophers, philosophical institutions and philosophical traditions.
Not in the sense I meant it.
If a philosopher makes a claim that seems on the surface to be false or incoherent, I have to decide whether to devote the additional effort to evaluating it to confirm or deny that initial judgment. One of the factors that will feed into that decision will be my estimate of the prior probability that they are saying something false or incoherent.
If I should refer to that using a word other than “trust”, that’s fine, tell me what word will refer to that to you and I’ll try to use it instead.
No, that describes what I’m talking about, so long as by trust you mean ‘a reason to hear out an argument that makes reference to the credibility of a field or its professionals’, rather than just ‘a reason to hear out an argument’. If the former, then I do think this is an inappropriate attitude toward philosophy. One reason for this is that such trust seems to depend on having a good standard for the success of a field independently of hearing out an argument. I can trust physicists because they make such good predictions, and because their work leads to such powerful technological advances. I don’t need to be a physicist to observe that. I don’t think philosophy has anything like that to speak for it. The only standards of success are the arguments themselves, and you can only evaluate them by just going ahead and doing some philosophy.
You can find trust in an institution independently of such standards by watching to see whether people you think are otherwise credible take it seriously. That will of course work with philosophy too, but if you trust Tom to be able to judge whether or not a philosophical claim is worth pursuing (and if I’m right about the above), then Tom can only be trustworthy in this regard because he has been doing philosophy (i.e. engaging with the argument). This could get you through the door on some particular philosophical claim, but not into philosophy generally.
I mean neither, I mean ‘a reason to devote time and resources to evaluating the evidence for and against a position.’ As you say, I can only evaluate a philosophical argument by ‘going ahead and doing some philosophy,’ (for a sufficiently broad understanding of ‘philosophy’), but my willingness to do, say, 20 hours of philosophy in order to evaluate Philosopher Sam’s position is going to depend on, among other things, my estimate of the prior probability that Sam is saying something false or incoherent. The likelier I think that is, the less willing I am to spend those 20 hours.
That’s fine, that’s not different from ‘hearing out an argument’ in any way important to my point (unless I’m missing something).
EDIT: Sorry, if you don’t want to include ‘that makes some reference to the credibility...etc.’ (or something like that) in what you mean by ‘trust’ then you should use a different term. Curiosity, or money, or romantic interest would all be reasons to devote time...etc. and clearly none of those are rightly called ‘trust’.
What do you have in mind as the basis for such a prior? Can you give me an example?
Point taken about other reasons to devote resources other than trust. I think we’re good here.
Re: example… I don’t mean anything deeply clever. E.g., if the last ten superficially-implausible ideas Sam espoused were false or incoherent, my priors for it will be higher than if the last ten such ideas were counterintuitive and brilliant.
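In case a concrete version helps, here is the kind of toy arithmetic I have in mind, using Laplace’s rule of succession with made-up numbers (purely illustrative; not a claim about how anyone actually computes trust):

```python
# Toy illustration of a track-record prior (a sketch with made-up numbers,
# not a claim about how anyone actually reasons). Model "the next
# superficially-implausible idea is false or incoherent" with a simple
# Beta-Binomial update (Laplace's rule of succession).

def posterior_prob_false(num_past_ideas: int, num_false: int) -> float:
    """P(next idea is false) after observing num_false failures out of
    num_past_ideas, starting from a uniform Beta(1, 1) prior."""
    return (num_false + 1) / (num_past_ideas + 2)

# Sam A: the last ten implausible-sounding ideas all turned out false/incoherent.
print(posterior_prob_false(10, 10))  # ~0.92 -> little appetite for 20 hours of work
# Sam B: the last ten implausible-sounding ideas were counterintuitive but brilliant.
print(posterior_prob_false(10, 0))   # ~0.08 -> much more worth the time
```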
Hm. I can’t argue with that, and I suppose it’s trivial to extend that to ‘if the last ten superficially-implausible ideas philosophy professors/books/etc. espoused were false or incoherent...’. So, okay, trust is an appropriate (because necessary) attitude toward philosophers and philosophical institutions. I think it’s right to say that philosophy doesn’t have external indicators in the way physics or medicine does, but the importance of that point seems diminished.
Dennett only thinks the idea of qualia is confused. He has no problem with his own books on consciousness.
No. He isn’t dismissing a whole academic subject, or a sub-field. Just one idea.
What is Dennett’s account for why philosophers of consciousness other than himself continue to think that a dismissable idea like qualia is worth continuing to discuss, even though he considers it closed?
Desrtopa doesn’t think moral philosophy is uniformly nonsense, since Desrtopa thinks one of its well known claims, moral relativism, is true.
While going on tangents is a common and expected occurrence, each such tangent has a chance of steering/commandeering the original conversation. LW has a tendency of going meta too much, when actual object level discourse would have a higher content value.
While you were practically invited to indulge in the death-by-meta with the hook of “Are you aware that that is basically what every crank says about some other field?”, we should be aware when leaving the object-level debating, and the consequences thereof. Especially since the lure can be strong:
When sufficiently meta, object-level disagreements may fizzle into cosmic/abstract insignificance, allowing for a peaceful pseudo-resolution, which ultimately just protects that which should be destroyed by the truth from being destroyed.
Such lures may be interpreted similarly to ad hominems: The latter try to drown out object-level disagreements by flinging shit until everyone’s dirty, the former zoom out until everyone’s dizzy floating in space, with vertigo. Same result to the actual debate. It’s an effective device, and one usually embraced by someone who feels like object-level arguments no longer serve his/her goals.
Ironically, this very comment goes meta lamenting going meta.
I mean that value systems are a function of physically existing things, the way a 747 is a function of physically existing things, but we have no evidence suggesting that objective morality is an existing thing. We have standards by which we judge beauty, and we project those values onto the world, but the standards are in us, not outside of us. We can see, in reductionist terms, how the existence of ethical systems within beings, which would feel from the inside like the existence of an objective morality, would come about.
Create a reasoning engine that doesn’t have those ethical systems built into it, and it would have no reason to care about them.
You can’t build a tower on empty air. If a debate has been going on for hundreds of years, stretching back to an argument which rests on “this defies our moral intuitions, therefore it’s wrong,” and that was never addressed with “moral intuitions don’t work that way,” then the debate has failed to progress in a meaningful direction, much as a debate over whether a tree falling in an empty forest makes a sound has if nobody bothers to dissolve the question.
That’s not an example. Please provide an actual one.
Sure, but it’s also what philosophers say about each other, all the time. Wittgenstein condemned practically all his predecessors and peers as incompetent, and declared that he had solved nearly the entirety of philosophy. Philosophy as a field is full of people banging their heads on a wall at all those other idiots who just don’t get it. “Most philosophers are incompetent, except for the ones who’re sensible enough to see things my way,” is a perfectly ordinary perspective among philosophers.
But I wasn’t saying that. I am arguing that moral claims have truth values that aren’t indexed to individuals or societies. That epistemic claim can be justified by appeal to an ontology including Moral Objects, but that is not how I am justifying it: my argument is based on rationality, as I have said many times.
We have standards by which we judge the truth values of mathematical claims, and they are inside us too, and that doesn’t stop mathematics being objective. Relativism requires that truth values are indexed to us, that there is one truth for me and another for thee. Being located in us, or being operated by us, are not sufficient criteria for being indexed to us.
We can see, in reductionistic terms, how the entities could converge on a uniform set of truth values. There is nothing non-reductionist about anything I have said. Reductionism does not force one answer to metaethics.
Provide evidence that ethics is a whole separate module, and not part of general reasoning ability.
Please explain why moral intuitions don’t work that way.
Please provide some foundations for something that aren’t unjustified by anything more foundational.
You can select one at random, obviously.
No, philosophers don’t regularly accuse each other of being incompetent... just of being wrong. There’s a difference.
You are inferring a lot from one example.
Nope.
I don’t understand, can you rephrase this?
The standards by which we judge the truth of mathematical claims are not just inside us. One object plus another object will continue to equal two objects whether or not there are any living beings to make that judgment. Math is not something we’ve created within ourselves, but something we’ve discovered and observed.
If our mathematical models ever stop being able to predict the behavior of the universe in advance, then we will have rather more reason to suspect that the math inside us is different from the math outside of us.
What evidence do we have that this is the case for morality?
My assertion is that, if we judge ethics as a rational system, innate values are among the axioms that the system is predicated on. You cannot prove the axioms of a system within that system, and an ethical system predicated on premises like “happiness is good” will not itself be able to prove the goodness of happiness.
While we could suppose that the axioms which our ethical systems are predicated on are objectively true, we have considerable reason to believe that we would have developed these axioms for adaptive reasons, even if there were no sense in which objective moral axioms exist, and we do not have evidence which suggests that objective, independently existing true moral axioms do exist.
People can be induced to strongly support opposing responses to the same moral dilemma, just by rephrasing it differently to trigger different heuristics. Our moral intuitions are incoherent.
I don’t think I understand this, can you rephrase it?
I do not recall any creditable attempts, which places me in a disadvantaged position with respect to locating them. You’re the one claiming that they’re there at all, that’s why I’m asking you to do it.
Philosophers don’t usually accuse each other of being incompetent in their publications, because it’s not conducive to getting other philosophers to regard their arguments dispassionately, and that sort of open accusation is generally frowned upon in academic circles whether one believes it or not. They do regularly accuse each other of being comprehensively wrong for their entire careers. In my personal conversations with philosophers (and I never considered myself to have really taken a class, or attended a lecture by a visitor, if I didn’t speak with the person teaching it on a personal basis to probe their thoughts beyond the curriculum,) I observed a whole lot of frustration with philosophers who, they think, just don’t get their arguments. It’s unsurprising that people would tend to become so frustrated participating in a field that basically amounts to long-running arguments extended over decades or centuries. Imagine the conversation we’re having now going on for eighty years, and neither of us has changed our minds. If you didn’t find my arguments convincing, and I hadn’t budged in all that time, don’t you think you’d start to suspect that I was particularly thick?
I’m using an example illustrative of my experience.
Sounds to me like PrawnOfFate is saying that any sufficiently rational cognitive system will converge on a certain set of ethical goals as a consequence of its structure, i.e. that (human-style) ethics is a property that reliably emerges in anything capable of reason.
I’d say the existence of sociopathy among humans provides a pretty good counterargument to this (sociopaths can be pretty good at accomplishing their goals, so the pathology doesn’t seem to be indicative of a flawed rationality), but at least the argument doesn’t rely on counting fundamental particles of morality or something.
I would say so also, but PrawnOfFate has already argued that sociopaths are subject to additional egocentric bias relative to normal people and thereby less rational. It seems to me that he’s implicitly judging rationality by how well it leads to a particular body of ethics he already accepts, rather than how well it optimizes for potentially arbitrary values.
Well, I’m not a psychologist, but if someone asked me to name a pathology marked by unusual egocentric bias I’d point to NPD, not sociopathy.
That brings up some interesting questions concerning how we define rationality, though. Pathologies in psychology are defined in terms of interference with daily life, and the personality disorder spectrum in particular usually implies problems interacting with people or societies. That could imply either irreconcilable values or specific flaws in reasoning, but only the latter is irrational in the sense we usually use around here. Unfortunately, people are cognitively messy enough that the two are pretty hard to distinguish, particularly since so many human goals involve interaction with other people.
In any case, this might be a good time to taboo “rational”.
Since no claim has a probability of 1.0, I only need to argue that a clear majority of rational minds converge.
How do we judge claims about transfinite numbers?
Mathematics isn’t physics. Mathematicians prove theorems from axioms, not from experiments.
Not necessarily. E.g., for utilitarians, values are just facts that are plugged into the metaethics to get concrete actions.
Metaethical systems usually have axioms like “Maximising utility is good”.
I am not sure what you mean by “exist” here. Claims are objectively true if most rational minds converge on them. That doesn’t require Objective Truth to float about in space here.
Does that mean we can’t use moral intuitions at all, or that they must be used with caution?
Philosophers talk about intuitions because that is the term for something foundational that seems true, but can’t be justified by anything more foundational. LessWrongians don’t like intuitions, but don’t seem to be able to explain how to manage without them.
Did you post any comments explaining to the professional philosophers where they had gone wrong?
I don’t see the problem. Philosophical competence is largely about understanding the problem.
Yes, but the fact that the universe itself seems to adhere to the logical systems by which we construct mathematics gives credence to the idea that the logical systems are fundamental, something we’ve discovered rather than produced. We judge claims about unobserved mathematical constructs like transfinites according to those systems.
But utility is a function of values. A paperclipper will produce utility according to different values than a human.
Why would most rational minds converge on values? Most human minds converge on some values, but we share almost all our evolutionary history and brain structure. The fact that most humans converge on certain values is no more indicative of rational minds in general doing so than the fact that most humans have two hands is indicative of most possible intelligent species converging on having two hands.
It means we should be aware of what our intuitions are and what they’ve developed to be good for. Intuitions are evolved heuristics, not a priori truth generators.
It seems like you’re equating intuitions with axioms here. We can (and should) recognize that our intuitions are frequently unhelpful at guiding us to the truth, without throwing out all axioms.
If I did, I don’t remember them. I may have, I may have felt someone else adequately addressed them, I may not have felt it was worth the bother.
It seems to me that you’re trying to foist onto me the effort of locating something which you were the one to testify was there in the first place.
And philosophers frequently fall into the pattern of believing that other philosophers disagree with each other due to failure to understand the problems they’re dealing with.
In any case, I reject the notion that dismissing large contingents of philosophers as lacking in competence is a valuable piece of evidence with respect to crankishness, and if you want to convince me that I am taking a crankish attitude, you’ll need to offer some other evidence.
But claims about transfinites don’t correspond directly to any object. Maths is “spun off” from other facts, on your view. So, by analogy, moral realism could be “spun off” without needing any Form of the Good to correspond to goodness.
You seem to be assuming that morality is about individual behaviour. A moral realist system like utilitarianism operates at the group level, and would take paperclipper values into account along with all others. Utilitarianism doesn’t care what values are, it just sums or averages them.
Or perhaps you are making the objection that an entity would need moral values to care about the preferences of others in the first place. That is addressed by another kind of realism, the rationality-based kind, which starts from noting that rational agents have to have some value in common, because they are all rational.
a) they don’t have to converge on preferences, since things like utilitarianism are preference-neutral.
b) they already have, to some extent, because they are rational.
I was talking about rational minds converging on moral claims, not on values. Rational minds can converge on “maximise group utility” whilst what counts as utility varies considerably.
Axioms are formal statements; intuitions are gut feelings that are often used to justify axioms.
There is another sense of “intuition” where someone feels that it’s going to rain tomorrow or something. They’re not the foundational kind.
So do they call for them to be fired?
Spun off from what, and how?
Speaking as a utilitarian, yes, utilitarianism does care about what values are. If I value paperclips, I assign utility to paperclips; if I don’t, I don’t.
Why does their being rational demand that they have values in common? Being rational means that they necessarily share a common process, namely rationality, but that process can be used to optimize many different, mutually contradictory things. Why should their values converge?
So what if a paperclipper arrives at “maximize group utility,” and the only relevant member of the group which shares its conception of utility is itself, and its only basis for measuring utility is paperclips? The fact that it shares the principle of maximizing utility doesn’t demand any overlap of end-goal with other utility maximizers.
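To make the “shared procedure, divergent goals” point concrete, here is a toy sketch of my own, with made-up utility functions (purely illustrative, not anything you’ve proposed):

```python
# Toy sketch (made-up utility functions): two agents share the same decision
# procedure -- "pick the action that maximizes utility" -- but plug in
# different utility functions, and so choose different actions.

ACTIONS = ["build_paperclip_factory", "build_hospital", "do_nothing"]

def paperclipper_utility(action: str) -> float:
    return {"build_paperclip_factory": 100, "build_hospital": 0, "do_nothing": 0}[action]

def human_utility(action: str) -> float:
    return {"build_paperclip_factory": -50, "build_hospital": 80, "do_nothing": 0}[action]

def rational_choice(utility) -> str:
    # The shared "rationality": identical optimization over the same options.
    return max(ACTIONS, key=utility)

print(rational_choice(paperclipper_utility))  # build_paperclip_factory
print(rational_choice(human_utility))         # build_hospital
```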
But, as I’ve pointed out previously, intuitions are often unhelpful, or even actively misleading, with respect to locating the truth.
If our axioms are grounded in our intuitions, then entities which don’t share our intuitions will not share our axioms.
No, but neither do I, so I don’t see why that’s relevant.
Designating PrawnOfFate a probable troll or sockpuppet. Suggest terminating discussion.
Request accepted, I’m not sure if he’s being deliberately obtuse, but I think this discussion probably would have borne fruit earlier if it were going to. I too often have difficulty stepping away from a discussion as soon as I think it’s unlikely to be a productive use of my time.
What is your basis for the designation ? I am not arguing with your suggestion (I was leaning in the same direction myself), I’m just genuinely curious. In other words, why do you believe that PrawnOfFate is a troll, and not someone who is genuinely confused ?
Combined behavior in other threads. Check the profile.
“Troll” is a somewhat fuzzy label. Sometimes when I am wanting to be precise or polite and avoid any hint of Fundamental Attribution Error I will replace it with the rather clumsy or verbose “person who is exhibiting a pattern of behaviour which should not be fed”. The difference between “Person who gets satisfaction from causing disruption” and “Person who is genuinely confused and is displaying an obnoxiously disruptive social attitude” is largely irrelevant (particularly when one has their Hansonian hat on).
If there was a word in popular use that meant “person likely to be disruptive and who should not be fed” that didn’t make any assumptions or implications of the intent of the accused then that word would be preferable.
I am not sure I can explain that succinctly at the moment. It is also hard to summarise how you get from counting apples to transfinite numbers.
Rationality is not an automatic process; it is a skill that has to be learnt and consciously applied. Individuals will only be rational if their values prompt them to be. And rationality itself implies valuing certain things (lack of bias, non-arbitrariness).
Utilitarians want to maximise the utility of their groups, not their own utility. They don’t have to regard the utility of others as utility for themselves; they just need to feed facts about group utility into an aggregation function. And, using the same facts and the same function, different utilitarians will converge. That’s kind of the point.
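As a toy illustration of what I mean by feeding the same facts into the same aggregation function (made-up numbers, just to show the shape of the calculation):

```python
# Toy sketch with stipulated numbers: two utilitarians with different personal
# tastes feed the *same* facts about everyone's preferences into the *same*
# aggregation function, and so rank the options identically.

GROUP_PREFERENCES = {
    # utility each group member assigns to each option (stipulated numbers)
    "alice":        {"park": 5, "parking_lot": 1},
    "bob":          {"park": 2, "parking_lot": 4},
    "paperclipper": {"park": 0, "parking_lot": 3},
}

def total_utility(option: str) -> int:
    # The shared aggregation function: simple summation over the group.
    return sum(member[option] for member in GROUP_PREFERENCES.values())

def utilitarian_choice() -> str:
    return max(("park", "parking_lot"), key=total_utility)

# Whatever a given utilitarian personally enjoys, this calculation is the same:
print(utilitarian_choice())  # 'parking_lot' here, because the summed scores are 7 vs 8
```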
Compared to what? Remember, I am talking about foundational intuitions, the kind at the bottom of the stack. The empirical method of locating the truth rests on the intuition that the senses reveal a real external world. Which I share. But what proves it? That’s the foundational issue.
The question of moral realism is AFAICT orthogonal to the Orthogonality Thesis.
A lot of people here would seem to disagree, since I keep hearing the objection that ethics is all about values, and values are nothing to do with rationality.
Could you make the connection to what I said more explicit please? Thanks!
“Values are nothing to do with rationality” = the Orthogonality Thesis, so it’s a step in the argument.
It feels to me like the Orthogonality Thesis is a fairly precise statement, and moral anti-realism is a harder to make precise but at least well understood statement, and “values are nothing to do with rationality” is something rather vague that could mean either of those things or something else.
You can change that line, but then you’ll be optimizing for something other than paperclips, resulting in fewer paperclips.