A couple of points. One: you are right to note that very few algorithms can be analyzed without executing them. The reason is the same as the reason most numbers are not algebraic, most sequences are not compressible (with an a priori compressor), etc. That is also why computer programs are buggy and the bugs are not easy to find: any deviation from intention pushes an intended algorithm into incompressible territory. Two: the conclusion that “AGI is inherently unsafe”, not just “inherently unpredictable”, relies on an unstated definition of safety. If you agree with Eliezer and others that the “safety we care about” occupies measure zero among all possible superintelligent AIs, then yes, your Fire thesis follows. However, this point is often taken on faith by AI doomers and, as far as I know, has not been formalized. Eliezer’s argument is basically “there are many disjunctive ways for AI to kill everyone, and probably only a small number of conjunctive ways to avoid that”. While believable, it is far from proven, or even formalized.
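For concreteness, here is a minimal sketch of the counting argument behind “most sequences are not compressible”, fixing any a priori compressor and writing c for the number of bits saved:

\[
  \frac{\#\{x \in \{0,1\}^{n} : x \text{ has a description shorter than } n-c\}}{2^{n}}
  \;\le\; \frac{2^{\,n-c}-1}{2^{\,n}} \;<\; 2^{-c}
\]

There are 2^n binary strings of length n but fewer than 2^(n−c) descriptions shorter than n − c bits, so for c = 20 fewer than one string in a million can be compressed by more than 20 bits, whichever compressor was fixed in advance. The same counting intuition underlies the bug claim: a program that deviates from its short intended specification typically lands in this incompressible majority.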
I have trouble following the first point. If bugs are hard to find, shouldn’t this precisely entail that dangerous AI is hard to differentiate from benign AI?! Any literature you can suggest on the subject?
Regarding the second point: I don’t find Eliezer’s idea entirely convincing. But I don’t think the Fire thesis hinges on his view. Rather, it is built on the much weaker and simpler view that if we don’t know the utility function of some AGI system, then that system is dangerous. I find it very hard to see any convincing reason for thinking this is false. Eliezer thinks doom is the default; I just assume that ignorance makes it rational to err on the side of caution.