Like nobody except EY and a bunch of core MIRI people actually believes that prosaic alignment is impossible. I mean that every other researcher I know thinks Prosaic Alignment is possible, even if potentially very hard. That includes MIRI people like Evan Hubinger too. And note that some of these other alignment researchers actually work with Neural Nets and keep up to speed on the implementation details and subtleties, which in my book means their voices should count more.
I don’t get the impression that Eliezer’s saying that alignment of prosaic AI is impossible. I think he’s saying “it’s almost certainly not going to happen because humans are bad at things.” That seems compatible with “every other researcher I know thinks Prosaic Alignment is possible, even if potentially very hard” (if you go with the “very hard” part).
Yes, +1 to this; I think it’s important to distinguish between impossible (which is a term I carefully avoided using in my earlier comment, precisely because of its theoretical implications) and doomed (which I think of as a conjunction of theoretical considerations—how hard is this problem?—and social/coordination ones—how likely is it that humans will have solved this problem before solving AGI?).
I currently view this as consistent with e.g. Eliezer’s claim that Chris Olah’s work, though potentially on a pathway to something important, is probably going to accomplish “far too little far too late”. I certainly didn’t read it as anything like an unconditional endorsement of Chris’ work, as e.g. this comment seems to imply.
Ditto—the first half makes it clear that any strategy which isn’t at most 2 years slower than an unaligned approach will be useless, and that prosaic AI safety falls into that bucket.