Stop pushing the bus
With the recent proposals about moratoriums and regulation, should we also start thinking about a strike by AI researchers and developers?
The reasoning, as I imagine it, goes as follows. AI capability is now growing really fast, toward levels that will strongly affect the world. And AI safety lags behind. (A minute ago I used a ChatGPT jailbreak to get instructions for torturing a pregnant woman; that’s the market leader’s performance for you.) And finally, I want to make the argument that working on AI capability while it is ahead of AI safety is “pushing the bus”.
Here’s the metaphor: a bunch of people, including you, are pushing a bus full of children toward a precipice, and you’re paid for each step. In this situation, would you really say “oh, I have to keep pushing, otherwise others will get all the money”? It’s not like they’ll profit from it! Their children will die, along with everyone else’s! So there’s no game-theoretic angle: you can just make the decision alone to stop pushing the frigging bus.
To clarify, working on AI isn’t always bad. It could lead to a wonderful future for humanity. But when AI safety is behind, as it is now, working on AI capability is pushing the bus. There’s no good justification for it.
Hence the strike. Not by leadership, but by AI researchers and developers themselves. I imagine a desk plaque saying “while AI safety lags behind AI capability, I refuse to work on AI capability”. That’s the start condition, and it also tells you when to stop: when safety catches up with current capability. That means not just getting AIs to stop saying bad things but, for stronger AIs, safe and benevolent behavior more generally. And it’s also the restart condition if safety starts lagging behind again.