And the best part of this is that instead of telling people who might already be working in some benign branch of ML that there’s this huge problem with AGI (people who could potentially defect into AGI work because it sounds cool), you’re already talking to people who, from your perspective, are doing the worst thing in the world. There’s no failure mode where some psychopath gets intrigued by the “power” of turning the world into paperclips; they’re already working at DeepMind or OpenAI. Personally, I think that failure mode is overblown, but this is one way you get around it.
If this is saying that there’s no plausible downside, that statement seems incorrect. Whether or not someone has a narrative of “working on AGI” is not a very important bit. It takes 2 minutes to put that on a resume, and it would even be defensible if you’d ever done anything with a neural net. What matters more is the organizing principle of their technical explorations, and there’s a whole space of possible organizing principles. If you’re publicizing your AI capabilities insights, the organizing principle of “tweak and mix algorithms to get gains on benchmarks” burns the AGI fuse less than “try new algorithms”, which in turn burns it less than “try to make AGI”. To argue that AGI kills you by default, you have to argue that there are algorithmic inventions to be found that generalize across domains to enable large capability jumps, without having a controllable / understood relationship to goals. That a fortiori says something about the power of generalization, which might change how people organize their research. If “X is possible” can be the main insight of a research breakthrough, then “AGI is dangerous” contains “AGI is powerful”, which contains “that kind of power is possible”, and so the warning could have similar effects.
On another note:
There are other downsides here. I don’t think that means this class of strategy should be taboo, and I think this class of strategy absolutely should be worked on. But if you pursue a strategy, it’s pretty crucial to notice its downsides, and to notice the ways the strategy is doomed to not actually help at all.
Downside: depending on implementation, you turn yourself into an ideological repeater. (This means you probably end up talking to people selected for, at least while they’re talking to you, being ideological repeaters themselves, which makes your strategy useless.) So you put top AGI researchers in an environment more densely filled with ideological repeaters. So you make it correct for top AGI researchers to model the people around them as not having coherent perspectives, but instead as engaging in ideological power struggles. If top AGI researchers are treating people that way, that bodes poorly for mind-changing communication.