I don’t know much about CEV (I started to read Eliezer’s paper but I didn’t get very far), but I’m not sure it’s possible to extrapolate values like that. What if 19th-century slave owners hold white-people-are-better as a terminal value?
On the other hand, it does seem plausible that a slave owner would oppose slavery if he weren’t himself a slave owner, so his CEV may indeed support racial equality. I simply don’t know enough about CEV or how to implement it to make a judgment one way or the other.
Terminal values can change with education. Saying that the coherent extrapolated volition of 19th-century slave owners would have been racist is equivalent to saying that either racism is justified by the facts, or the fundamental norms of rationality latent in 19th-century slave-owner cognition are radically unlike our contemporary fundamental norms of rationality. For instance, slave-owners don’t on any deep level value consistency between their moral intuitions, or they assign zero weight to moral intuitions involving empathy.
If new experiences and rationality training couldn’t ever persuade a slave-owner to become an egalitarian, then I’m extremely confused by the fact that society has successfully eradicated the memes that restructured those slave-owners’ brains so quickly. Maybe I’m just more sanguine than most people about the possibility that new information can actually change people’s minds (including their values). Science doesn’t progress purely via the eradication of previous generations.
I’m not sure I’d agree with that framing. If an ethical feature changes with education, that’s good evidence that it’s not a terminal value, to whatever extent it makes sense to talk about terminal values in humans. That extent may very well be “not very much”; our value structure is a lot messier than that of the theoretical entities for which the terminal/instrumental dichotomy works well, and if we had a good way of cleaning it up we wouldn’t need proposals like CEV.
People can change between egalitarian and hierarchical ethics without neurological insults or biochemical tinkering, so human “terminal” values clearly don’t necessitate one or the other. More importantly, though, CEV is not magic; it can resolve contradictions between the ethics you feed into it, and it might be able to find refinements of those ethics that our biases blind us to or that we’re just not smart enough to figure out, but it’s only as good as its inputs. In particular, it’s not guaranteed to find universal human values when evaluated over a subset of humanity.
If you took a collection of 19th-century slave owners and extrapolated their ethical preferences according to CEV-like rules, I wouldn’t expect that to spit out an ethic that allowed slavery—the historical arguments I’ve read for the practice didn’t seem very good—but I wouldn’t be hugely surprised if it did, either. Either way it wouldn’t imply that the resulting ethic applies to all humans or that it derives from immutable laws of rationality; it’d just tell us whether it’s possible to reconcile slavery with middle-and-upper-class 19th-century ethics without downstream contradictions.
“Saying that the coherent extrapolated volition of 19th-century slave owners would have been racist is equivalent to saying that either racism is justified by the facts, or the fundamental norms of rationality latent in 19th-century slave-owner cognition are radically unlike our contemporary fundamental norms of rationality.”
Could you elaborate on this please? If you’re saying what I think you’re saying then I would strongly like to argue against your point.