For all I know you are right about Yudkowsky’s pre-2011 view about deep math. However, (a) that wasn’t Bostrom’s view AFAICT, and (b) I think that’s just not what this OP quote is talking about. From the OP:
I feel like a bunch of people have shifted a bunch in the type of AI x-risk that worries them (representative phrase is “from Yudkowsky/Bostrom to What Failure Looks Like part 1”) and I still don’t totally get why.
It’s Yudkowsky/Bostrom, not Yudkowsky. And it’s WFLLp1, not p2. Part 2 is the one where the AIs do a treacherous turn; part 1 is where actually everything is fine except that “you get what you measure” and our dumb obedient AIs are optimizing for the things we told them to optimize for rather than for what we want.
I am pretty confident that WFLLp1 is not the main thing we should be worrying about; WFLLp2 is closer, but even it involves this slow-takeoff view (in the strong sense, in which the economy is growing fast before the point of no return) which I’ve argued against. I do not think that the reason people shifted from “Yudkowsky/Bostrom” (which in this context seems to mean “single AI project builds AI in the wrong way, AI takes over world”) to WFLLp1 is that people rationally considered all the arguments and decided that WFLLp1 was on balance more likely. I think instead that probably some sort of optimism bias was involved, and more importantly a win by default (Yud + Bostrom stopped talking about their scenarios and arguing for them, whereas Paul wrote a bunch of detailed posts laying out his scenarios and arguments, and so in the absence of visible counterarguments Paul wins the debate by default). Part of my feeling about this is that it’s a failure on my part; when Paul+Katja wrote their big post on takeoff speeds I disagreed with it and considered writing a big point-by-point response, but never did, even after various people posted questions asking “has there been any serious response to Paul+Katja?”
Re (a): I looked at chapters 4 and 5 of Superintelligence again, and I can kind of see what you mean, but I’m also frustrated that Bostrom seems really non-committal in the book. He lists a whole bunch of possibilities but then doesn’t seem to actually come out and give his mainline visualization/“median future”. For example he looks at historical examples of technology races and compares how much lag there was, which seems a lot like the kind of thinking you are doing, but then he also says things like “For example, if human-level AI is delayed because one key insight long eludes programmers, then when the final breakthrough occurs, the AI might leapfrog from below to radically above human level without even touching the intermediary rungs.” which sounds like the deep math view. Another relevant quote:
Building a seed AI might require insights and algorithms developed over many decades by the scientific community around the world. But it is possible that the last critical breakthrough idea might come from a single individual or a small group that succeeds in putting everything together. This scenario is less realistic for some AI architectures than others. A system that has a large number of parts that need to be tweaked and tuned to work effectively together, and then painstakingly loaded with custom-made cognitive content, is likely to require a larger project. But if a seed AI could be instantiated as a simple system, one whose construction depends only on getting a few basic principles right, then the feat might be within the reach of a small team or an individual. The likelihood of the final breakthrough being made by a small project increases if most previous progress in the field has been published in the open literature or made available as open source software.
Re (b): I don’t disagree with you here. (The only part that worries me is, I don’t have a good idea of what percentage of “AI safety people” shifted from one view to the other, whether there were also new people with different views coming into the field, etc.) I realize the OP was mainly about failure scenarios, but it did also mention takeoffs (“takeoffs won’t be too fast”) and I was most curious about that part.
I also wish I knew what Bostrom’s median future was like, though I perhaps understand why he didn’t put it in his book—the incentives all push against it. Predicting the future is hard and people will hold it against you if you fail, whereas if you never try at all and instead say lots of vague prophecies, people will laud you as a visionary prophet.
Re (b) cool, I think we are on the same page then. Re takeoff being too fast—I think a lot of people these days think there will be plenty of big scary warning shots and fire alarms that motivate lots of people to care about AI risk and take it seriously. I think that suggests that a lot of people expect a fairly slow takeoff, slower than I think is warranted. Might happen, yes, but I don’t think Paul & Katja’s arguments are that convincing that takeoff will be this slow. It’s a big source of uncertainty for me though.