As I understand it, Eliezer’s concept of precision involves trying to find formalizations that are provably unique or optimal in some sense, not just formalizations that work. For example, Bayesianism is a unique solution under the conditions of Cox’s theorem. One of Pei’s papers points out flaws in Bayesianism and proposes an alternative approach, but without any proof of uniqueness. I see that as an avoidable mistake.
I guess Pei’s intuition is that a proof of uniqueness or optimality under unrealistic assumptions is of little practical value, and doing such proofs under realistic assumptions is unfeasible compared to the approach he is taking.
ETA: When you write most kinds of software, you don’t first prove that your design is optimal or unique, but just start with something that you intuitively think would work, and then refine it by trial and error. Why shouldn’t this work for AGI?
ETA 2: In case it wasn’t clear, I’m not advocating that we build AGIs by trial and error, but just trying to explain what Pei is probably thinking, and why cousin_it’s link isn’t likely to be convincing for him.
If one isn’t concerned about the AGI’s ability to either (a) be able out-of-the-box to either successfully subvert the testing mechanisms being applied to it or successfully neutralize whatever mechanisms are in place to deal with it if it “fails” those tests, or (b) self-improve rapidly enough to achieve that state before those testing or dealing-with-failures mechanisms apply, then sure, a sufficiently well-designed test harness around some plausible-but-not-guaranteed algorithms will work fine, as it does for most software.
Of course, if one is concerned about an AGI’s ability to do either of those things, one may not wish to rely on such a test harness.
It seems to follow from this that some kind of quantification of what kinds of algorithms can do either of those things, and whether there’s any way to reliably determine whether a particular algorithm falls into that set prior to implementing it, might allow AGI developers to do trial-and-error work on algorithms that provably don’t meet that standard would be one way of making measurable progress without arousing the fears of those who consider FOOMing algorithms a plausible existential risk.
Of course, that doesn’t have the “we get it right and then everything is suddenly better” aspect of successfully building a FOOMing FAI… it’s just research and development work, the same sort of incremental collective process that has resulted in, well, pretty much all human progress to date.
I guess he was talking about the kind of precision more specific to AI, which goes like “compared to the superintelligence space, the Friendly AI space is a tiny dot. We should aim precisely, at the first try, or else”.
And because the problem is impossible to solve, you have to think precisely in the first place, or you won’t be able to aim precisely at the first try. (Because whatever Skynet we build won’t give us the shadow of a second chance).
I guess he was talking about the kind of precision more specific to AI, which goes like “compared to the superintelligence space, the Friendly AI space is a tiny dot. We should aim precisely, at the first try, or else”.
Compared to the space of possible 747 component configurations, the space of systems representing working 747s is tiny. We should aim precisely, at the first try, or else!
Well, yes. But to qualify as a super-intelligence, a system have to have optimization power way beyond a mere human. This is no small feat, but still, the fraction of AIs that do what we would want compared to the ones that would do something else (crushing us like a car does an insect in the process) is likely tiny.
A 747 analogy that would work for me would be that on the first try, you have to set the 747 full of people at high altitude. Here, the equivalent of “or else” would be “the 747 falls like an anvil and everybody dies”.
Sure, one can think of ways to test an AI before setting it lose, but beware that if it’s more intelligent than you, it will outsmart you the instant you give it the opportunity. No matter what, the first real test flight will be full of passengers.
Well, nobody is starting out with a superintelligence. We are starting out with sub-human intelligence. A superhuman intelligence is bound to evolve gradually.
No matter what, the first real test flight will be full of passengers.
It didn’t work that way with 747s. They did loads of testing before risking hundreds of lives.
No matter what, the first real test flight will be full of passengers.
It didn’t work that way with 747s. They did loads of testing before risking hundreds of lives.
747s aren’t smart enough to behave differently when they do or don’t have passengers. If the AI might be behaving differently when it’s boxed then unboxed, then any boxed test isn’t “real”; unboxed tests “have passengers”.
I guess he was talking about the kind of precision more specific to AI, which goes like “compared to the superintelligence space, the Friendly AI space is a tiny dot.
But that stance makes assumptions that he does not share, as he does not believe that AGI will become uncontrollable.
Eliezer’s epiphany about precision, which I completely subscribe to, negates most of Pei’s arguments for me.
I have read the post and I still don’t understand what you mean.
As I understand it, Eliezer’s concept of precision involves trying to find formalizations that are provably unique or optimal in some sense, not just formalizations that work. For example, Bayesianism is a unique solution under the conditions of Cox’s theorem. One of Pei’s papers points out flaws in Bayesianism and proposes an alternative approach, but without any proof of uniqueness. I see that as an avoidable mistake.
I guess Pei’s intuition is that a proof of uniqueness or optimality under unrealistic assumptions is of little practical value, and doing such proofs under realistic assumptions is unfeasible compared to the approach he is taking.
ETA: When you write most kinds of software, you don’t first prove that your design is optimal or unique, but just start with something that you intuitively think would work, and then refine it by trial and error. Why shouldn’t this work for AGI?
ETA 2: In case it wasn’t clear, I’m not advocating that we build AGIs by trial and error, but just trying to explain what Pei is probably thinking, and why cousin_it’s link isn’t likely to be convincing for him.
If one isn’t concerned about the AGI’s ability to either (a) be able out-of-the-box to either successfully subvert the testing mechanisms being applied to it or successfully neutralize whatever mechanisms are in place to deal with it if it “fails” those tests, or (b) self-improve rapidly enough to achieve that state before those testing or dealing-with-failures mechanisms apply, then sure, a sufficiently well-designed test harness around some plausible-but-not-guaranteed algorithms will work fine, as it does for most software.
Of course, if one is concerned about an AGI’s ability to do either of those things, one may not wish to rely on such a test harness.
It seems to follow from this that some kind of quantification of what kinds of algorithms can do either of those things, and whether there’s any way to reliably determine whether a particular algorithm falls into that set prior to implementing it, might allow AGI developers to do trial-and-error work on algorithms that provably don’t meet that standard would be one way of making measurable progress without arousing the fears of those who consider FOOMing algorithms a plausible existential risk.
Of course, that doesn’t have the “we get it right and then everything is suddenly better” aspect of successfully building a FOOMing FAI… it’s just research and development work, the same sort of incremental collective process that has resulted in, well, pretty much all human progress to date.
I guess he was talking about the kind of precision more specific to AI, which goes like “compared to the superintelligence space, the Friendly AI space is a tiny dot. We should aim precisely, at the first try, or else”.
And because the problem is impossible to solve, you have to think precisely in the first place, or you won’t be able to aim precisely at the first try. (Because whatever Skynet we build won’t give us the shadow of a second chance).
Compared to the space of possible 747 component configurations, the space of systems representing working 747s is tiny. We should aim precisely, at the first try, or else!
Well, yes. But to qualify as a super-intelligence, a system have to have optimization power way beyond a mere human. This is no small feat, but still, the fraction of AIs that do what we would want compared to the ones that would do something else (crushing us like a car does an insect in the process) is likely tiny.
A 747 analogy that would work for me would be that on the first try, you have to set the 747 full of people at high altitude. Here, the equivalent of “or else” would be “the 747 falls like an anvil and everybody dies”.
Sure, one can think of ways to test an AI before setting it lose, but beware that if it’s more intelligent than you, it will outsmart you the instant you give it the opportunity. No matter what, the first real test flight will be full of passengers.
Well, nobody is starting out with a superintelligence. We are starting out with sub-human intelligence. A superhuman intelligence is bound to evolve gradually.
It didn’t work that way with 747s. They did loads of testing before risking hundreds of lives.
747s aren’t smart enough to behave differently when they do or don’t have passengers. If the AI might be behaving differently when it’s boxed then unboxed, then any boxed test isn’t “real”; unboxed tests “have passengers”.
Sure, but that’s no reason not to test. It’s a reason to try and make the tests realistic.
The point is not that we shouldn’t test. The point is that tests alone don’t give us the assurances we need.
But that stance makes assumptions that he does not share, as he does not believe that AGI will become uncontrollable.