Eliezer, you write as if there is no alternative to this plan, as if your hand is forced. But that’s exactly what some people believe about neural networks. What about first understanding human morality and moral growth well enough that we (not an AI) can deduce and fully describe someone’s morality (from his brain scan, or behavior, or words) and predict his potential moral growth in various circumstances, and maybe well enough to correct any flaws that we see either in the moral content or in the growth process, and only then programming the seed AI’s morality and moral growth based on that understanding, once we’re convinced it’s sufficiently good? Your logic of (paraphrasing) “this information exists only in someone’s brain so I must let the AI grab it directly without attempting to understand it myself” simply makes no sense. First, the conclusion doesn’t follow from the premise, and second, if you let the AI grab and extrapolate the information without understanding it yourself, there is no way you can predict a positive outcome.
In case people think I’m some kind of moralist for harping on this so much, I think there are several other aspects of intelligence that are not captured by the notion of “optimization”. I gave some examples here. We need to understand all aspects of intelligence, not just the first facet for which we have a good theory, before we can try to build a truly Friendly AI.
(Eliezer, why do you keep using “intelligence” to mean “optimization” even after agreeing with me that intelligence includes other things that we don’t yet understand?)
“Morality does not compress”
You can’t mean that morality literally does not compress (i.e., is truly random). Obviously there are plenty of compressible regularities in human morality. So perhaps what you mean is that it’s too hard or impossible to compress it into a description small enough for humans to understand. But we also have no evidence that effective universal optimization in the presence of real-world computational constraints (as opposed to idealized optimization with unlimited computing power) can be compressed into a description small enough for humans to understand.