That all makes sense, but I’m missing the link between the above understanding of ‘formal’ and these four claims, if they’re what you were trying to say before:
(1) Indirect indirect normativity is less formal, in the relevant sense, than indirect normativity. I.e., because we're incorporating more of humans' natural language into the AI's decision-making, the reasoning system will be more tolerant of local errors, uncertainty, and noise.
(2) Programming an AI to value humans’ True Preferences in general (indirect normativity) has many pitfalls that programming an AI to value humans’ instructions’ True Meanings in general (indirect indirect normativity) doesn’t, because the former is more formal.
(3) “‘Tell the AI in English’ can fail, but the worst case is closer to a ‘With Folded Hands’ scenario than to paperclips.”
(4) The “With Folded Hands”-style scenario I have in mind is not as terrible as the paperclips scenario.