To some extent, Bing Chat is already an example of this. During the announcement, Microsoft promised that Bing would be using an advanced technique to guard it against user attempts to have it say bad things; in reality, it was incredibly easy to prompt it to express intents of harming you, at least during the early days of its release. This led to news headlines such as Bing’s AI Is Threatening Users. That’s No Laughing Matter.
To some extent, Bing Chat is already an example of this. During the announcement, Microsoft promised that Bing would be using an advanced technique to guard it against user attempts to have it say bad things; in reality, it was incredibly easy to prompt it to express intents of harming you, at least during the early days of its release. This led to news headlines such as Bing’s AI Is Threatening Users. That’s No Laughing Matter.