RowanWang

Karma: 280

https://rowankwang.com/

Modifying LLM Beliefs with Synthetic Document Finetuning

Apr 24, 2025, 9:15 PM
68 points
11 comments · 2 min read · LW link
(alignment.anthropic.com)

Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small

Oct 28, 2022, 11:55 PM
101 points
9 comments · 9 min read · LW link · 2 reviews
(arxiv.org)

Gears-Level Mental Models of Transformer Interpretability

RowanWang · Mar 29, 2022, 8:09 PM
72 points
4 comments · 6 min read · LW link

Lessons After a Couple Months of Trying to Do ML Research

RowanWang · Mar 22, 2022, 11:45 PM
70 points
8 comments · 6 min read · LW link