I’ve seen a bunch of places where people in the AI Optimism cluster dismiss arguments that use evolution as an analogy (for anything?) because they consider it debunked by Evolution provides no evidence for the sharp left turn. I think many people (including myself) believe that piece didn’t fully debunk the use of evolution arguments when discussing misalignment risk. Several people have written what I think are good responses to that piece; many of the comments, especially this one, and some posts.
I don’t really know what to do here. The arguments often look like:
A: “Here’s an evolution analogy which I think backs up my claims.”
B: “I think the evolution analogy has been debunked and I don’t consider your argument to be valid.”
A: “I disagree that the analogy has been debunked, and think evolutionary analogies are valid and useful.”
The AI Optimists seem reasonably unwilling to rehash the evolution analogy argument because they consider it settled (I hope I’m not being uncharitable here). I think this is often a reasonable move; I’m not particularly interested in arguing about climate change or flat-earth theories, for example, because I do consider those settled. But I don’t think the evolution analogy argument is settled.
One might think that the obvious move here is to go to the object level. But this would just be attempting to rehash the evolution analogy argument again, which is exactly what the AI Optimists seem (maybe reasonably) unwilling to do.
Is there a single document that you think canonically sets forth the evolution analogy and why it concludes what it concludes? That is, a place that is legible and predictive, and that you’re satisfied with as self-contained, at least speaking for yourself if not for others?
I would like to read such a piece.
Are evolution analogies really that much of a crux? It seems like the evidence from evolution can’t get us that far in an absolute sense (though I could imagine evolution updating someone from a very low prior up to a moderate probability?), so we should be able to talk about more object-level things regardless.
Yeah, I agree with this; we should usually be able to talk about object-level things.
Although (as you note in your other comment) evolution is useful for thinking about inner optimizers, deceptive alignment, etc. I think that thinking about “optimizers” (what things create optimizers? what will the optimizers do?) is pretty hard, and I at least find it useful to be able to look at the one concrete case where some process created a generally competent optimizer.
Quintin Pope and Ryan Greenblatt have written responses that address those points. Ryan Greenblatt pointed out that the argument used in support of autonomous learning is only distinguishable from supervised learning if there are data limitations, and that we can tell an analogous story about supervised learning having a fast takeoff absent data limitations. Quintin Pope has long comments that I can’t really summarize, but one is a general-purpose response to Zvi’s post, and the other adds context to the debate between Quintin Pope and Jan Kulveit on culture:
https://www.lesswrong.com/posts/hvz9qjWyv8cLX9JJR/evolution-provides-no-evidence-for-the-sharp-left-turn#hkqk6sFphuSHSHxE4
https://www.lesswrong.com/posts/Wr7N9ji36EvvvrqJK/response-to-quintin-pope-s-evolution-provides-no-evidence#PS84seDQqnxHnKy8i
https://www.lesswrong.com/posts/wCtegGaWxttfKZsfx/we-don-t-understand-what-happened-with-culture-enough#YaE9uD398AkKnWWjz
I think evolution clearly provides some evidence for things like inner optimizers, deceptive alignment, and “AI takeoff which starts with ML and human-understandable engineering (e.g. scaffolding/prompting), but where different mechanisms drive further growth prior to full human obsolescence”[1].
Personally, I’m quite sympathetic overall to Zvi’s response post (which you link), and I had many of the same objections. I guess further litigation of this post (and the responses in the comments) might be the way to go if you want to go down that road?
I tend to be pretty sympathetic overall to many objections to hard takeoff, “sharp left turn” concerns, and high probability on high levels of difficulty in safely navigating powerful AI. But I still think that the “AI optimism” cluster is too dismissive of the case for despair and overconfident in the case for hope. And a bunch of this argument has maybe already occurred and doesn’t seem to have gotten very far. (Though the exact objections I would raise to the AI optimist people are moderately different from most of what I’ve seen so far.) So I’d be pretty sympathetic to just not trying to target them as an audience.
Note that key audiences for doom arguments are often like “somewhat sympathetic people at AI labs” and “somewhat sympathetic researchers or grantmakers who already have some probability on the threat models you outline”.
This is perhaps related to the “sharp left turn”, but I think the “sharp left turn” concept is poorly specified and might conflate a bunch of separate (though likely correlated) things. Thus, I prefer being more precise.