I generally agree that this is a future bottleneck, and relatedly the Cotra bio-anchors report emphasizes “how long are training episodes” as arguably the second most important factor driving timelines (after “do you believe in bio-anchors”).
I tend to expect both (1) and (2) to partially hold. In particular, while humans/humanity are not fully observable, there are a lot of approximations that make it kind of observable. But I do generally think that “do a bunch of sociology/microeconomics/Anders-Sandberg-style futurism research” is an often-unmentioned component of any deceptive FOOM scenario.
That said, “make copies of yourself to avoid being shut off” is a convergent instrumental goal that shows up even at fairly short episode lengths, and the solution (hack your way out onto the internet and then figure out your next steps) is plausibly implementable in 15 minutes. I think this is a better counterargument for, like, 3-minute episodes, which seems to be in self-driving-car range.
I agree with all of this.