I think we may be using different definitions of “value”. There’s a “value” like “what this agent is optimizing for right now”, and a “value” like “the cognitive structure that we’d call this agent’s terminal values if we looked at a comprehensive model of that agent”. I’m talking about the second type.
And the Divine Command model of morality, for example, has nothing to do with that second type. It's explicitly "I value the things God says to value because He says to value them". Divine Command values are explicitly instrumental, not terminal.
Under your latter definition, could an agent be surprised by learning what its values are?
Yes, very much so.