I agree with Nominull, a good number of lies are undetectable without having access to some sort of lie detector or the agent’s source code. If an AI wanted to lie “my recursive modification of my goal systems hasn’t led me to accept a goal that involves eventually destroying all human life” I don’t see any way we could bust that lie via the ‘Web’ until the AI was actively pursuing that goal. I value honesty not for the trouble it saves me but because I find (sometimes only hope) that the real world free of distortion is more interesting than any misrepresentation humans can conjure for selfish means.
I agree with Nominull, a good number of lies are undetectable without having access to some sort of lie detector or the agent’s source code. If an AI wanted to lie “my recursive modification of my goal systems hasn’t led me to accept a goal that involves eventually destroying all human life” I don’t see any way we could bust that lie via the ‘Web’ until the AI was actively pursuing that goal. I value honesty not for the trouble it saves me but because I find (sometimes only hope) that the real world free of distortion is more interesting than any misrepresentation humans can conjure for selfish means.