It would add some possibly-useful context to this review if you explained why you came to it with an axe to grind. (Just as race is both possibly-useful information and a possible source of prejudice to correct for, so also with your prior prejudices about this book.)
Much of the dialogue about AI Safety I encounter in off-the-record conversations seems to me like it’s not grounded in reality. I repeatedly hear (what I feel to be) a set of shaky arguments that both shut down conversation and are difficult to validate empirically.
The shaky argument is as follows:
1. Machine learning is rapidly growing more powerful. If trends continue it will soon eclipse human performance.
2. Machine learning equals artificial intelligence equals world optimizer.
3. World optimizers can easily turn the universe into paperclips by accident.
4. Therefore we need to halt machine learning advancement until the abstract philosophical + mathematical puzzle of AI alignment is solved.
I am not saying this line of reasoning is what AI researchers believe, or that it’s mainstream (among the rationality/alignment communities), or even that it’s wrong. The argument annoys me for the same reason a popular-yet-incoherent political platform annoys me: I have encountered badly-argued versions of the idea too many times.
I agree with #1, though I would quibble that “absolute power” should be distinguished from “sample efficiency”, and there is the separate question of how we’ll get to superintelligence. (I am bearish on applying the scaling hypothesis to existing architectures.) I agree with #3 in theory, but theory is often very different from practice. I disagree with #2 because it relies on the tautological equivalence of two definitions. I can imagine superintelligent machines that aren’t world optimizers. Without #2 the argument falls apart. It might be easy to build a superintelligence but hard to build a world optimizer.
I approached The Alignment Problem with the (incorrect) prior that it would be more vague abstract arguments untethered from technical reality. Instead, the book was dominated by ideas that have passed practical empirical tests.
Thanks! (I would not have guessed correctly.)