If the AI is friendly, then the technique I am using already produces a friendly AI, and I thus learn nothing more than how to prove that it is friendly.
But if the AI is unfriendly, the proof will be subtly corrupted, so I can't actually count the proof as any evidence of friendliness, since both a FAI and a UFAI can offer me exactly the same thing.