But why should we err at all? Should we not, rather, use as many carrots and sticks as is optimal?
“Err on the side of X” here doesn’t mean “prefer erring over optimality”; it means “prefer errors in direction X over errors in the other direction”. This is still vague, since it doesn’t say how much to care about this difference; but it’s not trivial advice (or trivially mistaken).
Yes, I know what the expression means. But that doesn’t answer the objection, which is “why are we concerning ourselves with the direction of the errors, when our objective should be to not have errors?”
The actual answer has already been given elsethread: the advice applies when changing the sign of the error is substantially easier than reducing its magnitude, and the payoff matrix is asymmetric with respect to the direction of the error.
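To make that concrete, here is a toy sketch, not anything taken from the thread. It assumes the error around the target is fixed noise we cannot shrink (the made-up `ERRORS` list), that we can shift our aim by a constant `bias`, and that undershooting costs three times as much per unit as overshooting (the made-up `UNDER_COST` and `OVER_COST`). All names and numbers are hypothetical.

```python
# Toy model: the noise magnitude is fixed, but we can shift where we aim.
# All values below are illustrative assumptions, not data.
ERRORS = [-2, -1, 0, 1, 2]   # equally likely deviations from where we aim
UNDER_COST = 3.0             # cost per unit of landing below the target
OVER_COST = 1.0              # cost per unit of landing above the target


def loss(deviation):
    """Asymmetric linear loss for landing `deviation` units from the target."""
    if deviation < 0:
        return UNDER_COST * -deviation
    return OVER_COST * deviation


def expected_loss(bias):
    """Average loss when we deliberately aim `bias` units above the target."""
    return sum(loss(bias + e) for e in ERRORS) / len(ERRORS)


# Compare a handful of candidate biases, including the unbiased aim (0).
candidates = [-2, -1, 0, 1, 2]
best_bias = min(candidates, key=expected_loss)

for b in candidates:
    print(f"bias={b:+d}  expected loss={expected_loss(b):.2f}")
print("best bias:", best_bias)
```

Under these assumptions the unbiased aim is not the cheapest one: shifting toward the less costly direction lowers the expected loss even though the spread of the errors is unchanged. With a symmetric payoff (equal costs), the same search would pick a bias of zero.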