Consider a system that simply accepts input information and integrates it into a huge probability distribution that it maintains. We can then query this oracle simply by examining the distribution. For example, we could use it to estimate the probability of some future event conditional on some other event, and so on.
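To make the proposal concrete, here is a minimal sketch (my own toy illustration, not anything from the proposal itself) of such a passive oracle: it only folds observations into a joint distribution and answers conditional-probability queries, and never acts. The names ProbabilityOracle, observe, and query are made up for the example.

```python
from itertools import product


class ProbabilityOracle:
    """Toy joint distribution over named binary variables; queried, never acting."""

    def __init__(self, variables):
        self.variables = list(variables)
        # Uniform prior over every joint assignment of the variables.
        self.weights = {
            assignment: 1.0
            for assignment in product([False, True], repeat=len(self.variables))
        }

    def observe(self, evidence):
        """Integrate input information by conditioning on observed values."""
        for assignment in self.weights:
            world = dict(zip(self.variables, assignment))
            if any(world[var] != val for var, val in evidence.items()):
                self.weights[assignment] = 0.0

    def query(self, event, given=None):
        """Estimate P(event | given) under the current distribution."""
        given = given or {}
        numerator = denominator = 0.0
        for assignment, weight in self.weights.items():
            world = dict(zip(self.variables, assignment))
            if all(world[var] == val for var, val in given.items()):
                denominator += weight
                if all(world[var] == val for var, val in event.items()):
                    numerator += weight
        return numerator / denominator if denominator else float("nan")


# Example query: probability of one event conditional on another.
oracle = ProbabilityOracle(["rain", "sprinkler", "wet_grass"])
oracle.observe({"wet_grass": True})
print(oracle.query({"rain": True}, given={"sprinkler": False}))
```

The design choice being gestured at is that nothing in this loop searches for or writes new programs; the system only conditions and sums.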
So the system literally has no internal optimization pressures which are capable of producing new internal programs? Well… I’m not going to say that it’s impossible for a human to make such a device, because that’s the kind of knee-jerk “I’d rather not have to think about it” response that people use to dismiss Friendly AI as too difficult. Perhaps, if I examined the problem for a while, I would come up with something.
However, a superintelligence operating in this mode has to be able to infer arbitrary programs to describe its own environment, and run those programs to generate deductions. What about modeling the future, or subjunctive conditionals? Can this Oracle AI answer questions like “What would a typical UnFriendly AI do?”, and if so, does its “probability distribution” contain a running UnFriendly AI? By hypothesis, this Oracle was built by humans, so the sandbox of its “probability distributions” (environmental models containing arbitrary running programs) may be flawed; or the UnFriendly AI may be able to create information within its program that would tempt or hack a human examining the probability distribution...
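Here is a hedged sketch, again my own illustration rather than anything in the comment, of why answering such questions means running programs: each candidate environment-model has to be executed to produce a deduction, and the “probability distribution” is just the weighted mixture of what those executions output. The function and parameter names are hypothetical.

```python
def mixture_prediction(models, weights, question):
    """Answer a query by running every candidate environment program.

    models   -- callables; each one simulates an inferred environment
                (possibly containing an arbitrary agent) and returns its answer
    weights  -- posterior probability assigned to each model
    question -- the query passed into every simulation
    """
    answer_distribution = {}
    for model, weight in zip(models, weights):
        # Generating the deduction requires actually executing the model:
        # if a model encodes an UnFriendly AI, that AI is run here.
        outcome = model(question)
        answer_distribution[outcome] = answer_distribution.get(outcome, 0.0) + weight
    return answer_distribution
```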
I am extremely doubtful of the concept of a passively safe superintelligence, in any form.
Can an FAI model a UFAI more powerful than itself? If not, why shouldn’t it be able to keep a weaker one boxed?