It only showed that when language models that are larger, or have undergone more RLHF training, simulate an “Assistant” character, they exhibit more of these behaviours.
Since Sydney is supposed to be an assistant character, and since you expect future user-assisting systems to likewise be deployed with such an assistant persona, that’s all the paper needs to show in order to explain Sydney and future Sydney-like behaviors.