What would your plan be to ensure that this kind of regulation actually net-improves safety? The null hypothesis for something like this is that you’ll empower a bunch of bureaucrats to push rules that are at least 6 months out of date under conditions of total national emergency where everyone is watching, and years to decades out of date otherwise.
This could be catastrophic! If the only approved safety techniques are as out of date as the only approved medical techniques, AI regulation seems like it should vastly increase P(doom) at the point that TAI is developed.
It’s hard for me to imagine regulators who have direct authority to decline to license big training runs deciding instead to ban safety techniques.
In fact, I can’t think of a safety technique that could plausibly be banned in ~any context. Some probably exist, but they’re not a majority.
Here’s one example of a way that regulations could increase risk, even without explicitly trying to ban safety techniques:
If Christiano is right, and LLMs are among the safest possible ways to make agents, then prohibiting them could mean that when some kind of RL-based agents arrive in a few years, we’ve deprived ourselves of thousands of useful beings who could help with computer security, help us plan and organize, and watch for signs of malign intent, and who would have been harmless subjects with which to practice interpretability and so on. It could be like how the environmental movement blocked nuclear power plants.
Thank you. I agree that kind of thing is plausible (but maybe not that particular example—I think this regulation would hit the RL-agents too).
(I think giving regulators a stop button is clearly positive-EV and gallabytes’s concern doesn’t make sense, but I know that’s much weaker than what I asserted above.)
Sure, a stop button doesn’t have the issues I described, as long as it’s used rarely enough. If it’s too commonplace, then you should expect effects on safety similar to, e.g., CEQA’s effects on infrastructure innovation. Major projects can only take on so much risk, and the more non-technical risk you add, the less technical novelty will fit into that budget.
This line from the proposed “Responsible AI Act” seems to go much further than a stop button though?

Require advanced AI developers to apply for a license & follow safety standards.
Where do these safety standards come from? How are they enforced?
These same questions apply to stop buttons. Who has the stop button? Random bureaucrats? Congress? Anyone who can file a lawsuit?
It depends on the form regulation takes. The proposal here requires approval of training runs over a certain scale, which means everything is banned at that scale, including safety techniques, with exceptions decided by the approval process.