ReaderM

Karma: 107

Large Language Models can Strategically Deceive their Users when Put Under Pressure.

ReaderM · 15 Nov 2023 16:36 UTC
89 points
8 comments · 2 min read · LW link
(arxiv.org)