Arthur Conmy

Karma: 1,655

Interpretability

Views my own

[Paper] All’s Fair In Love And Love: Copy Suppression in GPT-2 Small

CallumMcDougall, Arthur Conmy, Cody Rushing, Tom McGrath and Neel Nanda

Oct 13, 2023, 6:32 PM

82 points

4 comments, 8 min read, LW link

Three ways interpretability could be impactful

Arthur Conmy

Sep 18, 2023, 1:02 AM

47 points

8 comments, 4 min read, LW link

Mechanistically interpreting time in GPT-2 small

rgould, Elizabeth Ho and Arthur Conmy

Apr 16, 2023, 5:57 PM

68 points

6 comments, 21 min read, LW link

RLHF does not appear to differentially cause mode-collapse

Arthur Conmy and beren

Mar 20, 2023, 3:39 PM

95 points

9 comments, 3 min read, LW link

OpenAI introduce ChatGPT API at 1/10th the previous $/token

Arthur Conmy

Mar 1, 2023, 8:48 PM

28 points

4 comments, 1 min read, LW link

(openai.com)

Arthur Conmy’s Shortform

Arthur Conmy

Nov 1, 2022, 9:35 PM

2 points

1 comment, LW link

Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small

RowanWang, Alexandre Variengien, Arthur Conmy, Buck and jsteinhardt

Oct 28, 2022, 11:55 PM

101 points

9 comments, 9 min read, LW link, 2 reviews

(arxiv.org)