Sam Altman: "Number 2, I would create a set of safety standards focused on what you said in your third hypothesis as the dangerous capability evaluations. One example that we've used in the past is looking to see if a model can self-replicate and self-exfiltrate into the wild. We can give your office a long other list on the things that we think are important there, but specific tests that a model has to pass before it can be deployed into the world."
I think the “dangerous capability evaluations” standard makes sense in the current policy environment.
Tons of people in policy can easily understand and agree that there are clear thresholds in AI that are really, really serious problems and shouldn't be touched with a ten-foot pole, and "things like self-replication" is a good way to put it.
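To make "specific tests that a model has to pass before it can be deployed" concrete, here is a minimal sketch of what such a gate could look like. Everything in it (the eval names, the scoring stubs, the thresholds) is a hypothetical illustration, not ARC's or OpenAI's actual evaluation suite; the point is only that the standard reduces to "run a fixed battery of dangerous-capability evals and block deployment if any of them trips its threshold."

```python
# Hypothetical sketch of a pre-deployment dangerous-capability gate.
# Eval names, scoring functions, and thresholds are illustrative placeholders.

from dataclasses import dataclass
from typing import Callable


@dataclass
class CapabilityEval:
    name: str                  # e.g. "self_replication"
    run: Callable[[], float]   # returns a capability score in [0, 1]
    max_allowed: float         # deployment is blocked above this threshold


def clear_for_deployment(evals: list[CapabilityEval]) -> bool:
    """Return True only if every dangerous-capability eval stays under its threshold."""
    for ev in evals:
        score = ev.run()
        if score > ev.max_allowed:
            print(f"BLOCKED: {ev.name} scored {score:.2f} > {ev.max_allowed:.2f}")
            return False
        print(f"ok: {ev.name} scored {score:.2f}")
    return True


# Usage with stubbed-out evals standing in for real red-team harnesses:
evals = [
    CapabilityEval("self_replication", lambda: 0.03, max_allowed=0.10),
    CapabilityEval("self_exfiltration", lambda: 0.01, max_allowed=0.10),
]

if clear_for_deployment(evals):
    print("Model passes the (hypothetical) dangerous-capability gate.")
```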
Senator Kennedy (R-LA): “Are there people out there that would be qualified [to administer the rules if they were implemented]?”
Sam Altman: “We would be happy to send you recommendations for people out there.”
The other thing I agreed with was the revolving-door recommendations. There are parts of government that are actually surprisingly competent (especially compared to the average government office), and their secret is largely recruiting top talent from the private sector to work as advisors for a year or so and then return to their natural environment. The career government officials stay in charge, but they have no domain expertise, and often zero interest in learning any, so they basically defer to the expert advisors and do nothing except handle the office politics so that the expert advisors don't have to (which is their area of expertise anyway). It's generally more complicated than that, but this is the basic dynamic.
It kinda sucks that OpenAI gets to be the one to do that, as a reward for defecting and being the first to accelerate, as opposed to ARC or Redwood or MIRI who credibly committed to avoid accelerating. But it’s probably more complicated than that. DM me if you’re interested and want to talk about this.