All human preferences, in their exact form, are hidden. The complexity of human value is too great for a merely human mind to comprehend all of it explicitly.
Comprehending them is especially hard if you use models based on utility maximization rather than on prediction error minimization, or if you assume that human values are coherent even within a given individual, let alone across humanity as a whole.
That being said, it is certainly possible to map a subset of one’s preferences as they pertain to some specific subject, and to do a fair amount of pruning and tuning. One’s preferences are not necessarily opaque to reflection; they’re mostly just nonobvious.