What is the null hypothesis here? That Microsoft is, through light fine-tuning, optimizing for the reactions of journalists and AI safety researchers and commentators. On this hypothesis, the model is designed to give weird responses so that people talk about it.
A Flood of Ideas: The Null Hypothesis of AI Safety with respect to Bing Chat
Excellent scenario building! Like other commenters, I had been toying around with scenarios like this, and it’s good to see someone put so much effort into making a highly detailed and plausible one.
Extra kudos for avoiding the Singleton flaw of most AI scenarios, where there is “one model to rule them all” instead of countless powerful actors working in alternately (and sometimes simultaneously) cooperative and competitive ways.