What about this alternate twist: safety cases are the right model, but it just happens to be extremely difficult to make an adequate safety case for competent agentic AGI (or anything close).
Introducing the safety case model for near-future AI releases could normalize that difficulty. It should be fairly easy to make a safety case for GPT-4 or Claude 3.5. When people want to deploy real AGI, they won’t be able to make the safety case without real advances in alignment. And that’s the point.
My guess is that it’s infeasible to ask them to delay until they have a real safety case, due to insufficient coordination (including, perhaps, international competition).
Isn’t that an argument against almost any regulation? The bar for a “safety case” can be adjusted up or down, and for better or worse it will be.
I think real safety cases are currently very easy to make: sure, a few eggs might be broken by releasing an information-and-idea search that’s somewhat better than Google. Some jackass will use it for ill, and thereby cause slightly more harm than they would’ve with Google alone. But the increase in the risk of major harms is tiny compared to the massive benefit of giving everyone something like a competent assistant that’s near-expert in almost every domain.
Maybe we’re too hung up on downsides as a society to accept this as a safety case. Our societal and legal risk aversion might make safety cases a nonstarter, even though they seem likely to be the correct model if we could collectively think anywhere close to clearly.