Trying to advance AI in hopes that your team will get all the way to AGI first, and in hopes that you’ll also somehow solve alignment at the same time. Backfires if your AI-advancing ideas leak, especially if you don’t actually manage to be first. Backfires worst in worlds where you came closest to succeeding by actually making tons of AI progress. Reminds me of “shooting the moon” in the card game Hearts that way.
Sometimes I wonder whether the whole current AI safety movement is net harmful via prompting many safety-concerned people and groups to attempt AI capabilities research with unlikely “shoot-the-moon”-style payoffs in mind.