Let me clarify that I don’t argue from agreement per se. I care about the underlying epistemic mechanism of agreement, which I claim is also the mechanism of correctness. My point is that I don’t see a similar epistemic mechanism in the case of morality.
Of course, emotions are verifiable states of brains. And the same goes for preferring actions that would lead to certain emotions and not others. It is a verifiable fact that you like chocolate. It is a contingent property of my brain that I care, but I don’t see what sort of argument that it is correct for me to care could even in principle be inherently compelling.
I don’t know what passes your test of ‘in principle be an inherently compelling argument’. It’s a toy example, but here are some steps that to me seem logical / rational / coherent / right / sensible / correct:
1. X is a state of mind that feels bad to whatever mind experiences it (this is the starting assumption, it seems we agree that such an X exists, or at least something similar to X)
2. X, experienced on a large scale by many minds, is bad
3. Causing X on a large scale is bad
4. When considering what to do, I’ll discard actions that cause X, and choose other options instead.
Now, some people will object and say that there are holes in this chain of reasoning, i.e. that 2 doesn’t logically follow from 1, or 3 doesn’t follow from 2, or 4 doesn’t follow from 3. For the sake of this discussion, let’s say that you object to the step from 1 to 2. Then, what about this replacement:
1. X is a state of mind that feels bad to whatever mind experiences it [identical to original 1]
2. X, experienced on a large scale by many minds, is good [replaced ‘bad’ with ‘good’]
Does this passage from 1 to 2 seem, to you (our hypothetical objector), equally logical / rational / coherent / right / sensible / correct as the original step from 1 to 2? Could I replace ‘bad’ with basically anything, and the correctness would not change at all as a result?
My point is that, to many reflecting minds, the replacement seems less logical / rational / coherent / right / sensible / correct than the original step. And this is what I care about for my research: I want an AI that reflects in a similar way, an AI to which the original steps do seem rational and sensible, while replacements like the one I gave do not.
That was good for my understanding of your position. My main problem with the whole thing, though, is the use of the word “bad”. I think it should be taboo at least until we establish a shared meaning.
Specifically, I think that most observers will find the first argument more logical than the second, because of a fallacy in using the word “bad”. I think that we learn that word in a way that is deeply entangled with power reward mechanism, to the point that it is mostly just a pointer to negative reward, things that we want to avoid, things that made our parents angry… In my view, the argument is then basically:
I want to avoid my suffering, and more generally person p wants to avoid person p’s suffering. Therefore suffering is “to be avoided” in general, therefore suffering is “thing my parents will punish for”, therefore avoid creating suffering.
When written that way, it doesn’t seem more logical than its opposite.
To a kid, ‘bad things’ and ‘things my parents don’t want me to do’ overlap to a large degree. This is not true for many adults. This is probably why the step
suffering is “to be avoided” in general, therefore suffering is “thing my parents will punish for”
seems weak.
Overall, what is the intention behind your comments? Are you trying to understand my position even better, and if so, why? Are you interested in funding this kind of research; or are you looking for opportunities to change your mind; or are you trying to change my mind?
Since I became reasonably sure that I understand your position and reasoning—mostly changing it.