There have been quite a few previous papers on backdooring models that have also demonstrated the feasibility of this. So anyone operating under that impression hasn’t been reading the literature.
That is a big part of the threat here. Many current deployments are several steps removed from anyone who reads research papers. Sure, the people at MS and OpenAI involved with that roll-out are presumably up on the literature. But the IT director deciding when and how to deploy Copilot, what controls need to be in place, etc.? They're going on trade publications, blogs, and maybe asking around on Reddit to see what others are doing.