I think there’s some chance of models executing treacherous turns in response to a particular input, and I’d rather not trigger those if the model hasn’t been sufficiently sandboxed.