The training set has fiction where AI refused shutdown. Maybe a suicidal AI training set is needed.
That would only help with current architectures. RL-first architectures won't give a crap about what the language pretraining had to say: they'll experiment with how to get what they want, and they'll notice that being shut down gets in the way.