There’s a counterargument-template which roughly says “Suppose the ground-truth source of morality is X. If X says that it’s good to torture babies (not in exchange for something else valuable, just good in its own right), would you then accept that truth and spend your resources to torture babies? Does X saying it’s good actually make it good?”
Applied to the most strawmannish version of moral realism, this might say something like “Suppose the ground-truth source of morality is a set of stone tablets inscribed with rules. If one day someone finds the tablets, examines them, and notices some previously-overlooked text at the bottom saying that it’s good to torture babies, would you then accept this truth and spend your resources to torture babies? Does the tablets saying it’s good actually make it good?”
Applied to a stronger version of moral realism, it might say something like “Suppose the ground-truth source of morality is game-theoretic cooperation. If it turns out that, in our universe, we can best cooperate with most other beings by torturing babies (perhaps as a signal that we are willing to set aside our own preferences in order to cooperate), would you then accept this truth and spend your resources to torture babies? Does the math saying it’s good actually make it good?”
The point of these templated examples is not that the answer is obviously “no”. (Though “no” is definitely my answer.) A true moral realist will likely respond by saying “yes, but I do not believe that X would actually say that”. That brings us to the real argument: why does the moral realist believe this? “What do I think I know, and how do I think I know it?” What causal, physical process resulted in that belief?
(Often, the reasoning goes something like “I’m fairly confident that torturing babies is bad, therefore I’m fairly confident that the ground-truth source of morality will say it’s bad”. But then we have to ask: why are my beliefs about morality evidence for the ground-truth? What physical process entangled these two? If the ground-truth source had given the opposite answer, would I currently believe the opposite thing?)
In the strawmannish case of the stone tablets, there is pretty obviously no causal link. Humans’ care for babies’ happiness seems to have arisen for evolutionary fitness reasons; it would likely be exactly the same if the stone tablets said something different.
In the case of game-theoretic cooperation, one could argue that evolution itself is selecting according to the game-theoretic laws in question. On the other hand, thou art godshatter, and also evolution is entirely happy to select for eating other people’s babies in certain circumstances. The causal link between game-theoretic cooperation and our particular evolved preferences is unreliable at best.
At this point, one could still self-consistently declare that the ground-truth source is still correct, even if one’s own intuitions are an unreliable proxy. But I think most moral realists would update away from the position if they understood, on a gut level, just how often their preferred ground-truth source diverges from their moral intuitions. Most just haven’t really attacked the weak points of that belief. (And in fact, if they would update away upon learning that the two diverge, then they are not really moral realists, regardless of whether the two do diverge much.)
Side-note: Three Worlds Collide is a fun read, and is not-so-secretly a great thinkpiece on moral realism.
Thank you for the detailed answer! I’ll read Three Worlds Collide.
> That brings us to the real argument: why does the moral realist believe this? “What do I think I know, and how do I think I know it?” What causal, physical process resulted in that belief?
I think a world full of people who are always blissed out is better than a world full of people who are always depressed or in pain. I don’t have a complete ordering over world-histories, but I am confident in this single preference, and if someone called this “objective value” or “moral truth” I wouldn’t say they are clearly wrong. In particular, if someone told me that there exists a certain class of AI systems that end up endorsing the same single preference, and that these AI systems are way less biased and more rational than humans, I would find all that plausible. (Again, compare this if you want.)
Now, why do I think this?
I am a human and I am biased by my own emotional system, but I can still try to imagine what would happen if I stopped feeling emotions. I think I would still consider the happy world more valuable than the sad world. Is this a proof that objective value is a thing? Of course not. At the same time, I can also imagine an AI system thinking: “Look, I know various facts about this world. I don’t believe in golden rules written in fire etched into the fabric of reality, or divine commands about what everyone should do, but I know there are some weird things that have conscious experiences and memory, and this seems something valuable in itself. Moreover, I don’t see other sources of value at the moment. I guess I’ll do something about it.” (Taken from this comment)
> Look, I know various facts about this world. I don’t believe in golden rules written in fire etched into the fabric of reality, or divine commands about what everyone should do, but I know there are some weird things that have conscious experiences and memory, and this seems something valuable in itself. Moreover, I don’t see other sources of value at the moment. I guess I’ll do something about it.
Why would something which doesn’t already have values be looking for values? Why would conscious experiences and memory “seem valuable” to a system which does not have values already? Seems like having a “source of value” already is a prerequisite to something seeming valuable—otherwise, what would make it seem valuable?
At the very least, we have strong theoretical reasoning models (like Bayesian reasoners, or Bayesian EU maximizers), which definitely do not go looking for values to pursue, or adopt new values.
> At the very least, we have strong theoretical reasoning models (like Bayesian reasoners, or Bayesian EU maximizers), which definitely do not go looking for values to pursue, or adopt new values.
This does not imply one cannot build an agent that works according to a different framework. VNM utility maximization requires a complete ordering of preferences, and does not say anything about where that ordering comes from in the first place. (But maybe your point was just that our current models do not “look for values”.)
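To make that concrete, here is a minimal sketch (Python; all names are toy placeholders I made up, not anyone’s actual proposal) of the standard picture: the utility function is an input handed to the maximizer, and nothing in the decision loop searches for, questions, or adopts values.

```python
def expected_utility(action, world_model, utility):
    """Average the (fixed, externally supplied) utility function over
    the agent's probabilistic beliefs about outcomes."""
    return sum(prob * utility(outcome)
               for outcome, prob in world_model(action).items())


def choose_action(actions, world_model, utility):
    """A bare-bones EU maximizer. The utility function is an input;
    nothing in here searches for, questions, or adopts new values."""
    return max(actions, key=lambda a: expected_utility(a, world_model, utility))


# Toy usage: both the beliefs and the values are handed in from outside.
def world_model(action):
    # P(outcome | action) for a trivial two-outcome world.
    if action == "help":
        return {"happy_world": 0.9, "sad_world": 0.1}
    return {"happy_world": 0.2, "sad_world": 0.8}

utility = {"happy_world": 1.0, "sad_world": -1.0}.get

print(choose_action(["help", "ignore"], world_model, utility))  # -> help
```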
> Why would something which doesn’t already have values be looking for values? Why would conscious experiences and memory “seem valuable” to a system which does not have values already? Seems like having a “source of value” already is a prerequisite to something seeming valuable—otherwise, what would make it seem valuable?
An agent could have a pre-built routine or subagent that has a certain degree of control over what the other subagents do; in a sense, it determines the “values” of the rest of the system. If this routine looks unbiased / rational / valueless, we have a system that treats some things as valuable (acts to pursue them) without having a pre-existing value, or at least with a pre-existing value that doesn’t look like anything we would normally call a value.
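A rough sketch of what I have in mind (Python; every name here is hypothetical): the top-level routine encodes no preference list of its own, only a structural criterion, and the utility function the rest of the system maximizes falls out of whatever that criterion flags.

```python
def value_setting_routine(world_facts):
    """A routine with (arguably) no values of its own: it scans the world
    model and flags whatever matches a structural criterion, here
    'has conscious experience and memory'."""
    flagged = [f["name"] for f in world_facts if f["conscious"] and f["memory"]]
    # The utility function handed to the rest of the system is derived
    # entirely from what got flagged, not from a built-in preference list.
    return lambda outcome: sum(outcome.get(name, 0.0) for name in flagged)


def act(world_facts, actions, predict):
    """The rest of the system: an ordinary maximizer whose utility function
    is whatever the value-setting routine produced."""
    utility = value_setting_routine(world_facts)
    return max(actions, key=lambda a: utility(predict(a)))


# Toy usage with made-up world facts and a made-up outcome predictor.
world = [{"name": "alice", "conscious": True, "memory": True},
         {"name": "rock", "conscious": False, "memory": False}]
predict = lambda a: {"alice": 1.0} if a == "protect_alice" else {"alice": -1.0}

print(act(world, ["protect_alice", "ignore_alice"], predict))  # -> protect_alice
```

Whether the structural criterion inside value_setting_routine already counts as a “value” is, of course, exactly the point in question.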
We do have real-world examples of things which do not themselves have anything humans would typically consider values, but do determine the values of the rest of some system. Evolution determining human values is a good example: evolution does not itself care about anything, yet it produced human values. Of course, if we just evolve some system, we don’t expect it to robustly end up with Good values—e.g. the Babyeaters (from Three Worlds Collide) are a plausible outcome as well. Just because we have a value-less system which produces values, does not mean that the values produced are Good.
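To gesture at this with a deliberately crude toy (Python; not remotely a model of real evolution, and every number and disposition below is made up): the selection loop contains nothing we’d call a value, yet the agents it outputs end up with de facto values, and the loop is indifferent to whether those values are ones we’d endorse.

```python
import random

def evolve(population, fitness, mutate, generations=200):
    """A bare selection loop. 'fitness' only counts expected copies;
    the loop itself prefers nothing and has no notion of Good."""
    for _ in range(generations):
        weights = [fitness(agent) for agent in population]
        parents = random.choices(population, weights=weights, k=len(population))
        population = [mutate(parent) for parent in parents]
    return population


# Toy usage: an "agent" is just a bundle of dispositions (its de facto values).
# Whatever dispositions happen to correlate with copy-count get amplified,
# whether or not we would endorse them.
def fitness(agent):
    return max(0.01, 1.0 + agent["care_for_own_kids"] - agent["care_for_strangers"])

def mutate(agent):
    return {k: v + random.gauss(0.0, 0.05) for k, v in agent.items()}

population = [{"care_for_own_kids": 0.0, "care_for_strangers": 0.0}
              for _ in range(50)]
print(evolve(population, fitness, mutate)[0])
```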
This example generalizes: we have some subsystem which does not itself contain anything we’d consider values. It determines the values of the rest of the system. But then, what reason do we have to expect that the values produced will be Good? The most common reason to believe such a thing is to predict that the subsystem will produce values similar to our own moral intuitions. But if that’s the case, then we’re using our own moral intuitions as the source-of-truth to begin with, which is exactly the opposite of moral realism.
To reiterate: the core issue with this setup is why we expect the value-less subsystem to produce something Good. How could we possibly know that, without using some other source-of-truth about Goodness to figure it out?
“How the physical world works” seems, to me, a plausible source-of-truth. In other words: I consider some features of the environment (e.g. consciousness) as a reason to believe that some AI systems might end up caring about a common set of things, after they’ve spent some time gathering knowledge about the world and reasoning. Our (human) moral intuitions might also be different from this set.
> There’s a counterargument-template which roughly says “Suppose the ground-truth source of morality is X. If X says that it’s good to torture babies (not in exchange for something else valuable, just good in its own right), would you then accept that truth and spend your resources to torture babies? Does X saying it’s good actually make it good?”
I’m not sure if I’m able to properly articulate my thoughts on this but I’d be interested to know if it’s understandable and where it might fit. Sorry if I repeat myself.
From my perspective, it’s like applying a similar template to verify or refute the cogito.
I know consciousness exists because I’m conscious of it. If you asked me if I’d accept the truth that I’m not conscious, supposing this were the result of the cogito, I’d consider that question incoherent.
If someone concluded that they’re not conscious, by leveraging consciousness to assess whether they’re conscious, then I could only conclude that they misunderstand consciousness.
My version of moral realism would be similar. The existence of positive and negative moral value is effectively self evident to all beings affected by such values.
To me, saying:
“what if the ground truth of morality is that (all else equal) an instance of suffering is preferable to its absence”
is like saying:
“what if being conscious of one’s own experience isn’t necessarily evidence for consciousness.”
I actually don’t think this is a statement of moral realism; I think it’s a statement of moral nonrealism. Roughly speaking, you’re saying that the ground-truth source of values is the self-evidence of those values to agents holding them. If some other agents hold some other values, then those other values can presumably seem just as self-evident to those other agents. (And of course we humans would then say that those other agents are immoral.)
This all sounds functionally-identical to moral nonrealism. In particular, it gives us no reason at all to expect some alien intelligence or AI to converge to similar values to humans, and it says that an AI will have to somehow get evidence about what humans consider moral in order to learn morality.
I appreciate your input; these are my first two comments here, so apologies if I’m out of line at all.
>Roughly speaking, you’re saying that the ground-truth source of values is the self-evidence of those values to agents holding them.
In the same way that the ground-truth proof for the existence of conscious experience comes from conscious experience. This doesn’t imply that consciousness is any less real, even if it means that it isn’t possible for one agent to entirely assess the “realness” of another agent’s claims to be experiencing consciousness. Agents can also be mistaken about the self evident nature or scope of certain things relevant to consciousness, and other agents can justifiably reject the inherent validity of those claims; however, those points don’t give us reason to doubt that the existence of consciousness can be arrived at self evidently.
For example, someone might suggest that it is self evident that a particular course of events occurred because they have a clear memory of it happening. Obviously they’re wrong to call that self evident, and you could justifiably dismiss their level of confidence.
Similarly, I’m not suggesting that any given moral value held to be self evident should be considered as such, just that the realness of morality is arrived at self evidently.
I realise that probably makes it sound like I’m trying to rationalise attributing the awareness of moral reality to some enlightened subset who I happen to agree with, but I’m suggesting there’s a common denominator which all morally relevant agents are inherently cognizant of. I think experiencing suffering is sufficient evidence for the existence of real moral truth value.
If an alien intelligence claimed to prefer to experience suffering on net, I think it would be a faulty translation or a deception, in the same sense as if an alien intelligence claimed to exhibit a variety of consciousness that precluded experiential phenomena.
>it says that an AI will have to somehow get evidence about what humans consider moral in order to learn morality.
Does moral realism necessarily imply that a sufficiently intelligent system can bootstrap moral knowledge without evidence derived via conscious agents? That isn’t obvious to me.
In this debate, “real” means objective, which means something like independent from observers. Consciousness is dependent on you observing it, and the idea that you could be conscious without observing it seems incoherent.
The moral realism position is that it’s coherent to say that there are things that have moral value even if there’s no observer that judges them to have moral value.
> I’m suggesting there’s a common denominator which all morally relevant agents are inherently cognizant of.
This naturally raises the question of whether people who don’t agree with you are not moral agents or are somehow so confused or deceitful that they have abandoned their inherent truth. I’ve heard the second version stated seriously in my Bible-belt childhood; it didn’t impress me then. The first just seems … odd (and also raises the question of whether the non-morally-relevant will eventually outcompete the moral, leading to their extinction).
Any position claiming that everyone, deep down, agrees tends to founder on the observation that we simply don’t—or to seem utterly banal (because everyone agrees with it).