There’s no underlying reason for me to accept that I should accept CEV as the best implementation of my values.
Sure.
And even if I did accept CEV(humanity) as the best implementation of my values in principle, it’s also worth asking what grounds I’d have for believing that any particular formally specified value system generated as output by some seed AI actually was CEV(humanity).
Then again, there’s no underlying reason for me to accept that I should accept my current collection of habits and surface-level judgments and so forth as the best implementation of my values, either.
So, OK, at some point I’ve got a superhuman value-independent optimizer all rarin’ to go, and the only question is what formal specification of a set of values I ought to provide it with. So, what do I pick, and why do I pick it?
Isn’t this begging the question? By ‘my values’ I’m pretty sure I literally mean ‘my current collection of habits and surface-level judgments and so forth’.
Could I have terminal values of which I am completely unaware in any way, shape, or form? How would I even recognize such things, and what reason do I have to prefer them over ‘my values’?
Did I just go in a circle?
Well, you tell me: if I went out right now and magically altered the world to reflect your current collection of habits and surface-level judgments, do you think you would endorse the result?
I’m pretty sure I wouldn’t, if the positions were reversed.
I would want you to change the world so that what I want is actualized, yes. If you wouldn’t endorse an alteration of the world towards your current values, in what sense do you really ‘value’ said values?
I’m going to need to taboo ‘value’, aren’t I?
I don’t know if you need to taboo it or not, but I’ll point out that I asked you a question that didn’t use that word, and you answered a question that did.
So perhaps a place to start is by answering the question I asked in the terms that I asked it?