Ok let’s try this as a solution: All our neat little mechanisms and heuristics make up our values, but they come on a continuum of importance, and some of them sabotage the rest more than others.
For example, all those nice things like love and beauty seem very important, and usually don’t conflict with each other, so they sit closer to the “values” end of the continuum.
Things like risk aversion and hindsight bias aren’t terribly important in themselves, but because they prescribe behavior that is otherwise stupid in the decision theory/epistemology realm, they sabotage the achievement of other biases/values, and are therefore a net negative.
This can work for the high-value things like love and beauty and freedom as well. Say you are designing a machine that will achieve many of your values: being biased towards making it beautiful over functional could sabotage the achievement of those other values. Likewise, being biased against having powerful agents interfere with freedom can prevent you from accepting law or safety.
So debiasing is knowing how and when to override less important “values” for the sake of more important ones, like overriding your aversion to cold calculation in order to maximize lives saved in a “shut up and multiply” situation.
EDIT: This is all parachuted into the end of the OP /EDIT