Jesse Hoogland comments on You’re Measuring Model Complexity Wrong

Jesse Hoogland 16 Nov 2023 9:17 UTC
LW: 3 AF: 1
I think there’s some chance of models executing treacherous turns in response to a particular input, and I’d rather not trigger those if the model hasn’t been sufficiently sandboxed.