Ansh Radhakrishnan

Karma: 605

Ansh Radhakrishnan’s Shortform

Ansh Radhakrishnan · Oct 10, 2024, 10:00 PM
5 points
2 comments · 1 min read · LW link

Scalable Oversight and Weak-to-Strong Generalization: Compatible approaches to the same problem

Dec 16, 2023, 5:49 AM
76 points
4 comments · 6 min read · LW link · 1 review

Anthropic Fall 2023 Debate Progress Update

Ansh Radhakrishnan · Nov 28, 2023, 5:37 AM
75 points
9 comments · 12 min read · LW link

Measuring and Improving the Faithfulness of Model-Generated Reasoning

Jul 18, 2023, 4:36 PM
111 points
15 comments · 6 min read · LW link · 1 review

Causal scrubbing: results on induction heads

Dec 3, 2022, 12:59 AM
34 points
1 comment · 17 min read · LW link

Causal scrubbing: results on a paren balance checker

Dec 3, 2022, 12:59 AM
34 points
2 comments · 30 min read · LW link

Causal scrubbing: Appendix

Dec 3, 2022, 12:58 AM
18 points
4 comments · 20 min read · LW link

Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]

Dec 3, 2022, 12:58 AM
206 points
35 comments · 20 min read · LW link · 1 review

The Bio Anchors Forecast

Ansh Radhakrishnan · Jun 2, 2022, 1:32 AM
13 points
0 comments · 3 min read · LW link