Jason Gross

Karma: 226

Compact Proofs of Model Performance via Mechanistic Interpretability

24 Jun 2024 19:27 UTC
95 points
3 comments · 8 min read · LW link
(arxiv.org)