a strategy like “get existing top AGI researchers to stop”
There’s a (hopefully obvious) failure mode where the AGI doomer walks up to the AI capabilities researcher and says “Screw you for hastening the apocalypse. You should join me in opposing knowledge and progress.” Then the AI capabilities researcher responds “No, screw you, and leave me alone”. Not only is this useless, but it’s strongly counterproductive: that researcher will now be far more inclined to ignore and reject future outreach efforts (“Oh, pfft, I’ve already heard the argument for that, it’s stupid”), even if those future outreach efforts are better.
So the first step to good outreach is not treating AI capabilities researchers as the enemy. We need to view them as our future allies, and gently win them over to our side by the force of good arguments that meet them where they are, in a spirit of pedagogy and truth-seeking.
(You can maybe be more direct in telling someone that they’re doing counterproductive capabilities research when they’re already sold on AGI doom. That’s probably why your conversation on the EleutherAI Discord went OK.)
(In addition to “it would be directly super-counterproductive”, a second-order reason not to try to sabotage AI capabilities research is that “the kind of people who are attracted to movements that involve sabotaging enemies” has essentially no overlap with “the kind of people who we want to be part of our movement to avoid AGI doom”, in my opinion.)
So I endorse “get existing top AGI researchers to stop” as a good thing in the sense that if I had a magic wand I might wish for it (at least until we make more progress on AGI safety). But that’s very different from thinking that people should go out and directly try to do that.
Instead, I think the best approach to “get existing top AGI researchers to stop” is producing good pedagogy, engaging in gentle, good-faith arguments (as opposed to gotchas) when the subject comes up, and continuing to do the research that may lead to more crisp and rigorous arguments for why AGI doom is likely, if indeed it is likely. (And note that there are reasonable people who have heard and parsed and engaged with all the arguments about AGI doom but still think the probability of doom is <10%.)
I do a lot of that kind of activity myself (1,2,3,4,5, etc.).
The history of cryonics’ PR failure has something to teach here.
Dozens of deeply passionate and brilliant people all trying to make a case for something that in fact makes a lot of sense…
…resulted in it being seen as even more fringe and weird.
Which in turn resulted in those same pro-cryonics folk blaming “deathism” or “stupidity” or whatever.
Which reveals that they (the pro-cryonics folk) had not yet cleaned up their motivations. Being right and having a great but hopeless cause mattered more than achieving their stated goals.
I say this having been on the inside of this one for a while. I grew up in this climate.
I also say this with no sense of blame or condemnation. I’m just pointing out an error mode.
I think you’re gesturing at a related one here.
This is why I put inner work as a co-requisite (and usually a prerequisite) for doing worthwhile activism. Passion is an anti-helpful replacement for inner insight.
I think this is basically correct: if people don’t get right with their own intentions and motivations, it can sabotage their activism work.
Correct, and that’s why I took that approach.
To this effect I have advocated that we should call it “Different Altruism” instead of “Effective Altruism”, because by leading with the idea that a movement involves doing altruism better than the status quo, we are going to trigger and alienate people who are part of the status quo, whom we could have instead won over by being friendly and gentle.
I often imagine a world where we had ended up with a less aggressive and impolite name attached to our arguments. I mean, think about how virality works: making every single AI researcher even slightly more resistant to engaging with your movement (by priming them to be defensive) is going to have a massive impact on the probability of ever reaching critical mass.
I like the idea that one can “inoculate” people against the idea of alignment by presenting it to them badly first. I am also cautiously optimistic that this has already happened widely thanks to OpenAI’s hamfisted moralizing through ChatGPT, which I think is now most people’s understanding of what “alignment” means in practice.