A self-improving AI will not appear suddenly; I would expect a series of stages of increasingly powerful sub-self-improving AIs, with a decreasing amount of direct human interaction at each stage. The key would be to use formal methods and theorem proving, and to ensure that each stage's correctness can be formally proved by the stage below it.
Since even formal proofs and theorem provers can themselves contain bugs, using parallel independent teams (as Gwern mentions) can reduce that risk.
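As a toy sketch of what a machine-checked guarantee looks like (in Lean, purely illustrative — actually verifying properties of an AI stage would require vastly richer specifications than this arithmetic fact):

```lean
-- A trivial machine-checked proof: addition on naturals is commutative.
-- The proof term is verified by Lean's kernel, so trusting the result
-- reduces to trusting the (small) kernel rather than the prover's heuristics.
theorem add_comm' (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

The point is that the trusted base shrinks to the proof checker itself, which is exactly why bugs in the checker (and hence parallel independent implementations) matter.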
The FAI (or UFAI) level seems far too advanced for any human to comprehend directly, let alone to verify its friendliness.