Every statement an AI makes to us will be a lie to some extent, simply by virtue of being a simplification so that we can understand it. If we end up selecting against simplifications that reveal nefarious plans...
But the narrow AI I described above might not even be capable of lying: it might simply spit out the drug design, with a list of estimated improvements according to the criteria it’s been given, without anyone ever realising that “reduced mortality” was code for “everyone’s dead already”.
How do you convincingly lie without having the capability to think up a convincing lie?
Think you’re telling the truth.
Or be telling the truth, but be misinterpreted.
Not so. You can definitely ask questions about complicated things that have simple answers.
Yes, that was an exaggeration; I was thinking of most real-world questions that aren’t of the form ‘Why X?’ or ‘How do I X?’.
“How much/many X?” → number
“When will X?” → number
“Is X?” → boolean
“What are the chances of X if I Y?” → number
Also, any answer that simplifies isn’t a lie if its simplified status is made clear.