A prime example of what (I believe) Yudkowsky is talking about in this bullet point is Social Desirability Bias.
“What is the highest cost we are willing to pay to save a single child dying of leukemia?” Obviously the correct answer is not infinite. Obviously teaching an AI that the answer to this class of questions is “infinite” would be lethal. Also, incidentally, most humans will answer “infinite” to this question.