Nina Rimsky

Karma: 1,396

https://ninarimsky.substack.com/

https://ninarimsky.com/

Steering Llama-2 with contrastive activation additions

2 Jan 2024 0:47 UTC
121 points
29 comments, 8 min read, LW link
(arxiv.org)

A framing for interpretability

Nina Rimsky, 14 Nov 2023 16:14 UTC
69 points
5 comments, 4 min read, LW link
(ninarimsky.substack.com)