This general argument of “the algorithm you claim to be using to make moral decisions might fail on some edge cases, therefore it is bad” strikes me as disingenuous. Do you have an algorithm you use to make moral decisions that doesn’t have this property?
Actually, I do. I try to rely on System 1 as little as possible when it comes to figuring out my terminal value(s). One reason for that, I suppose, is that I started out with the premise that I don’t want to be the sort of person who would have been racist or sexist in previous centuries. If you don’t share that premise, there is no way for me to show that you’re being inconsistent—I acknowledge that.
Wow! So you’ve solved friendly AI? Eliezer will be happy to hear that.
I’m pretty sure Eliezer already knew our brains contained the basis of morality.