When an answer is robustly unattainable, it’s pointless to speculate about what it might be; you can only bet or build conditional plans. If “values” are “simple” but you don’t know that, your state of knowledge about the “values” remains non-simple, and that is what you impart to the AI. What does it matter which of these things, the confused state of knowledge or the correct answer to our confusion, we call “values”?
If I think the correct answer to our confusion will ultimately turn out to be something complex (in the Godshatter-like sense), then I can rule out any plans that eventually call for hard-coding such an answer into an AI. This seems to be Eliezer’s argument (or one of his main arguments) for implementing CEV.
On the other hand, if I think the correct answer may turn out to be simple, even if I don’t know what it is now, then there’s a chance I can find out the answer directly in the next few decades and then hard-code it into an AI. Something like CEV is no longer the obvious best approach.
(Personally I still prefer a “meta-ethical” or “meta-philosophical” approach, but we’d need a different argument for it besides “thou art godshatter”/”complexity of value”.)