About a month ago, I wrote a quick take suggesting that an early messaging mistake made by MIRI was: claim there should be a single leading FAI org, but not give specific criteria for selecting that org. That could’ve led to a situation where DeepMind, OpenAI, and Anthropic can all think of themselves as “the best leading FAI org”.
An analogous possible mistake that’s currently being made: Claim that we should “shut it all down”, and also claim that it would be a tragedy if humanity never created AI, but not give specific criteria for when it would be appropriate to actually create AI.
What sort of specific criteria? One idea: A committee of random alignment researchers is formed to study the design; if at least X% of the committee rates the odds of success at Y% or higher, it gets the thumbs up. Not ideal criteria, just provided for the sake of illustration.
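To make the X% / Y% rule above concrete, here is a minimal Python sketch. The function name, the choice to express ratings and thresholds as fractions, and the example numbers are all my own assumptions for illustration; nothing here is a proposed standard, and the same caveat applies (not ideal criteria).

```python
def committee_approves(ratings, min_share, min_odds):
    """Return True if at least `min_share` of members rate the odds >= `min_odds`.

    ratings   : each member's estimated probability of success, in [0, 1]
    min_share : the "X%" threshold, as a fraction (hypothetical parameter name)
    min_odds  : the "Y%" threshold, as a fraction (hypothetical parameter name)
    """
    if not ratings:
        raise ValueError("committee must have at least one member")
    # Count the members whose estimate clears the Y threshold.
    confident = sum(1 for r in ratings if r >= min_odds)
    # Approve only if that group is at least an X share of the committee.
    return confident / len(ratings) >= min_share


# Made-up committee of five and made-up thresholds (X = 80%, Y = 95%).
ratings = [0.99, 0.97, 0.96, 0.90, 0.98]
print(committee_approves(ratings, min_share=0.8, min_odds=0.95))
```

Writing it out this way mostly highlights how much depends on the unstated choices: who sits on the committee, and what values X and Y take.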
Why would this be valuable?
If we actually get a pause, it’s important to know when to unpause as well. Specific criteria could improve the odds that an unpause happens in a reasonable way.
If you want to build consensus for a pause, advertising some reasonable criteria for when we’ll unpause could get more people on board.
I think Six Dimensions of Operational Adequacy was in this direction; I wish we had been more willing to, like, issue scorecards earlier (like publishing that document in 2017 instead of 2022). The most recent scorecard-ish thing was commentary on the AI Safety Summit responses.
I also have the sense that the time to talk about unpausing is while creating the pause; this is why I generally am in favor of things like RSPs and RDPs. (I think others think that this is a bit premature / too easy to capture, and we are more likely to get a real pause by targeting a halt.)
Regarding the situation at OpenAI, I think it’s important to keep a few historical facts in mind:
The AI alignment community has long stated that an ideal FAI project would have a lead over competing projects. See e.g. this post:
Requisite resource levels: The project must have adequate resources to compete at the frontier of AGI development, including whatever mix of computational resources, intellectual labor, and closed insights are required to produce a 1+ year lead over less cautious competing projects.
The scaling hypothesis wasn’t obviously true around the time OpenAI was founded. At that time, it was assumed that regulation was ineffectual because algorithms can’t be regulated. It’s only now, when GPUs are looking like the bottleneck, that the regulation strategy seems viable.
What happened with OpenAI? One story is something like:
AI safety advocates attracted a lot of attention in Silicon Valley with a particular story about AI dangers and what needed to be done.
Part of this story involved an FAI project with a lead over competing projects. But the story didn’t come with easy-to-evaluate criteria for whether a leading project counted as a good “FAI project” or a bad “UFAI project”. Thinking about AI alignment is epistemically cursed; people who think about the topic independently rarely have similar models.
DeepMind was originally the consensus “FAI project”, but Elon Musk started OpenAI because Larry Page has e/acc beliefs.
OpenAI hired employees with a distribution of beliefs about AI alignment difficulty, some of whom may be motivated primarily by greed or power-seeking.
At a certain point, that distribution got “truncated” with the formation of Anthropic.
Presumably at this point, every major project thinks it’s best if they win, due to self-serving biases.
Some possible lessons:
Do more message red-teaming. If an organization like AI Lab Watch had been founded 10+ years ago, and was baked into the AI safety messaging along with “FAI project needs a clear lead”, then we could’ve spent the past 10 years getting consensus on how to anoint one or just a few “FAI projects”. And the campaign for AI Pause could instead be a campaign to “pause all AGI projects except the anointed FAI project”. So, when we look back in 10 years on the current messaging, what mistakes will seem obvious in hindsight? And if this situation is partially a result of MIRI’s messaging in the past, perhaps we should ask hard questions about their current pivot towards messaging? (Note: I could be accused of grinding my personal axe here, because I’m rather dissatisfied with current AI Pause messaging.)
Assume AI acts like a magnet for greedy power-seekers. Make decisions accordingly.
Some recent-ish bird flu coverage:
Global health leader critiques ‘ineptitude’ of U.S. response to bird flu outbreak among cows
A Bird-Flu Pandemic in People? Here’s What It Might Look Like. TLDR: not good. (Reload the page and ctrl-a then ctrl-c to copy the article text before the paywall comes up.) Interesting quote: “The real danger, Dr. Lowen of Emory said, is if a farmworker becomes infected with both H5N1 and a seasonal flu virus. Flu viruses are adept at swapping genes, so a co-infection would give H5N1 opportunity to gain genes that enable it to spread among people as efficiently as seasonal flu does.”
Infectious bird flu survived milk pasteurization in lab tests, study finds. Here’s what to know.
1 in 5 milk samples from grocery stores test positive for bird flu. Why the FDA says it’s still safe to drink—see also updates from the FDA here: “Last week we announced preliminary results of a study of 297 retail dairy samples, which were all found to be negative for viable virus.” (May 10)
The FDA is making reassuring noises about pasteurized milk, but given that CDC and friends also made reassuring noises early in the COVID-19 pandemic, I’m not fully reassured.
I wonder if drinking a little bit of pasteurized milk every day would be helpful inoculation? You could hedge your bets by buying some milk from every available brand, and consuming a teaspoon from a different brand every day, gradually working up to a tablespoon etc.