In questions like this, it’s very important to keep in mind the difference between our state of knowledge about preference (which corresponds to explicitly endorsed moral principles, such as “slavery bad!”; this clearly changed), and preference itself (which we mostly don’t understand, even if our minds define what it is). Since FAI needs to operate according to preference, and not our state of knowledge about preference, any changes in our state of knowledge (moral principles) are irrelevant, except where they have a chance of reflecting changes in the actual preference.
So the idea is that 21st century American and caveman Gork from 40000 BC probably have very similar preference, because they have very similar cognitive architecture, even though clearly they have different explicitly endorsed moral principles. This property is a “sanity check” on a method of defining preference, not an explicit requirement.
In other words, finding similar preferences in people from different eras is about the consistency expected between different maps of the same territory, not about adding a rule that demands consistency from the maps even when the changes such a rule would introduce aren’t based in fact.
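To make the map/territory reading concrete, here is a minimal toy sketch in Python of how such a “sanity check” could look. Everything in it is my own illustrative assumption rather than anything specified in the discussion: the ARCHITECTURE values, the person helper, and the two stand-in extraction methods. The point is only that a method which merely reads off endorsed principles fails the cross-era consistency check, while a method that reaches the shared architecture passes it.

    # Toy sketch only: "preference" lives in a shared cognitive architecture,
    # while explicitly endorsed principles are that architecture filtered
    # through culture. All numbers and names here are illustrative assumptions.

    ARCHITECTURE = {"comfort": 0.9, "status": 0.6, "novelty": 0.4}  # assumed shared across eras

    def person(cultural_shift):
        """A person: shared architecture plus a culture-specific distortion of what they endorse."""
        endorsed = {k: v + cultural_shift.get(k, 0.0) for k, v in ARCHITECTURE.items()}
        return {"architecture": dict(ARCHITECTURE), "endorsed": endorsed}

    american = person({"status": -0.3, "novelty": +0.4})  # hypothetical 21st-century shifts
    gork     = person({"status": +0.4, "novelty": -0.3})  # hypothetical 40000 BC shifts

    def naive_extract(p):
        """Reads off explicitly endorsed principles: stays at the level of the 'map'."""
        return p["endorsed"]

    def architecture_extract(p):
        """Idealized stand-in for a method that reaches the 'territory'."""
        return p["architecture"]

    def sanity_check(extract, a, b, tol=0.2):
        """Expected consistency: same architecture, different eras => similar extracted preference."""
        pa, pb = extract(a), extract(b)
        return all(abs(pa[k] - pb[k]) <= tol for k in pa)

    print(sanity_check(naive_extract, american, gork))         # False: endorsed principles diverge
    print(sanity_check(architecture_extract, american, gork))  # True: shared architecture recovered

Note that in this sketch the check is used diagnostically: a method that fails it is suspect, but nothing forces the two outputs to agree, matching the point that this is a sanity check rather than an explicit requirement.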
So the idea is that 21st century American and caveman Gork from 40000 BC probably have very similar preference, because they have very similar cognitive architecture
If something like Julian Jaynes’ notion of a recent historical origin of consciousness from a prior state of bicameralism is true, we might be in trouble there.
More generally, you need to argue that culture is a negligible part of cognitive architecture; I strongly doubt that is the case.
What do you believe about these immutable, universal preferences?
Here are some potential problems I see with these theorized builtin preferences, since we don’t know what they actually are yet:
They may conflict with our consciously held morals or desires: e.g., they may not include compassion or altruism for anyone we have never met face to face. They may even conflict with both our own morals and with Gork’s morals at the same time. In that case, why shouldn’t we privilege our conscious desires?
They may not be very interesting: just “want to have food and comfort, sex, social status, children”. They wouldn’t include many things we consciously want, because those things evolved out of subverted button-pushing or as hyperstimuli, such as scientific research. Why should we choose to discard such values just because they aren’t embodied in our hardware?
They may be susceptible to many cognitive traps and dead-ends (e.g. wireheading) that we can only work around using conscious thought and our consciously held values.
They may include values or desires we would consciously prefer to eradicate entirely, such as a drive for fighting or for making war. If you thought that 1) most people in history enjoyed and desired war, and 2) this was due to a feature of their builtin cognitive architecture that said “when in situation X, conquer your neighbors”—would you want to include this in the CEV?
They may include features that optimize inclusive genetic fitness at the expense of the user, e.g. causing emotional suffering as negative feedback.
They may include values or desires we would consciously prefer to eradicate entirely, such as a drive for fighting or for making war. If you thought that 1) most people in history enjoyed and desired war, and 2) this was due to a feature of their builtin cognitive architecture that said “when in situation X, conquer your neighbors”—would you want to include this in the CEV?
CEV is supposed to incorporate not only the things you want (or enjoy), but also the things you want to want (or don’t want to enjoy, in this case).
Supposed to, based on what evidence?

As Vladimir Nesov said, there are builtin preferences (which CEV takes into account), and then there are our conscious desires, or “state of knowledge about preference”. The two may be in conflict in some cases.
How do you know that CEV won’t include something that all the humans alive today, on the conscious level, would find hateful?
If you’re saying actual human preference is determined by human biology and brain architecture, but is mostly independent of brain content, that is a very new claim that I don’t remember ever hearing before. You’ll need pretty strong arguments to defend it. I’d bet at about 80% odds that Eliezer would disagree with it.
If you’re saying actual human preference is determined by human biology and brain architecture, but is mostly independent of brain content, that is a very new claim that I don’t remember ever hearing before.
Hmm, I think I’ve said this many times already. Of course beliefs are bound to change preference to some extent, but shouldn’t be allowed to do this too much. On reflection, you wouldn’t want the decisions (to obtain certain beliefs) of your stupid human brain with all its biases that you already know not to endorse, to determine what should be done with the universe.
Only where such decisions manage to overcome this principle will there be change, and I can’t even think of a specific example of when that should happen. Generally, you can’t trust yourself. The fact that you believe that X is better than Y is not in itself a reason to believe that X is better than Y, although you might believe that X is better than Y because it is (because of a valid reason for X being better than Y, a reason which your belief that X is better than Y isn’t); see the sketch after this comment.
So when beliefs do change your preference, it probably won’t be in accordance with beliefs about preference.
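One way to read the X-is-better-than-Y point, purely as my own Bayesian gloss (the symbols B and R below are mine, not anything used in the discussion): let B stand for “you believe X is better than Y” and R for “a valid reason exists for X being better than Y”. The belief tracks the truth only through the reason, so once R is fixed, B carries no further weight:

    % My gloss, not the author's notation: B = "you believe X > Y",
    % R = "a valid reason for X > Y exists".
    % Screening off: given R, the belief itself adds no evidential weight.
    \[
      P(X \succ Y \mid B, R) = P(X \succ Y \mid R)
    \]
    % B can still correlate with X > Y unconditionally, because B tends to be caused by R:
    \[
      P(X \succ Y \mid B) \ge P(X \succ Y)
    \]

On this reading, a belief about preference matters only as a symptom of a valid reason, which fits the claim that belief-driven changes to preference probably won’t track beliefs about preference.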
On reflection, you wouldn’t want the decisions (to obtain certain beliefs) of your stupid human brain with all its biases that you already know not to endorse, to determine what should be done with the universe.
As opposed to our biology and brain architecture, which were designed by the blind idiot god.
But don’t our biological preferences imply pressing pleasure buttons? Isn’t it just because of our cultural/learnt preferences (brain content) that we assign low utility to drug-induced happiness and push-button pleasure?