
Wuschel Schulz

Karma: 297

A short ‘derivation’ of Watanabe’s Free Energy Formula

Wuschel Schulz · Jan 29, 2024, 11:41 PM
13 points
6 comments · 7 min read

Steering Llama-2 with contrastive activation additions

Jan 2, 2024, 12:47 AM
125 points
29 comments · 8 min read
(arxiv.org)

Simulators Increase the Likelihood of Alignment by Default

Wuschel Schulz · Apr 30, 2023, 4:32 PM
13 points
1 comment · 5 min read

If Wentworth is right about natural abstractions, it would be bad for alignment

Wuschel Schulz · Dec 8, 2022, 3:19 PM
29 points
5 comments · 4 min read

A caveat to the Orthogonality Thesis

Wuschel Schulz · Nov 9, 2022, 3:06 PM
38 points
10 comments · 2 min read