I guess this doesn’t fit with the use in the Truthful AI paper that you quote. In that case I also have an objection: compared to a “strict liability” regime, punishing only negligence may incentivize an AI to lie in cases where it knows the truth but thinks the human believes the AI doesn’t or can’t know it.