Arthur Conmy

Karma: 1,651

Interpretability

Views my own

[Paper] All’s Fair In Love And Love: Copy Suppression in GPT-2 Small

Oct 13, 2023, 6:32 PM
82 points
4 comments · 8 min read · LW link

Three ways interpretability could be impactful

Arthur Conmy · Sep 18, 2023, 1:02 AM
47 points
8 comments · 4 min read · LW link

Mechanistically interpreting time in GPT-2 small

Apr 16, 2023, 5:57 PM
68 points
6 comments · 21 min read · LW link

RLHF does not appear to differentially cause mode-collapse

Mar 20, 2023, 3:39 PM
95 points
9 comments · 3 min read · LW link

OpenAI introduce ChatGPT API at 1/10th the previous $/token

Arthur Conmy · Mar 1, 2023, 8:48 PM
28 points
4 comments · 1 min read · LW link
(openai.com)

Arthur Conmy’s Shortform

Arthur Conmy · Nov 1, 2022, 9:35 PM
2 points
1 comment · LW link

Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small

Oct 28, 2022, 11:55 PM
101 points
9 comments · 9 min read · LW link · 2 reviews
(arxiv.org)