Judd Rosenblatt

Karma: 1,285

CEO at AE Studio

Reducing LLM deception at scale with self-other overlap fine-tuning

Mar 13, 2025, 7:09 PM
135 points
28 comments · 6 min read · LW link

Alignment can be the ‘clean energy’ of AI

Feb 22, 2025, 12:08 AM
66 points
8 comments · 8 min read · LW link

Making a conservative case for alignment

Nov 15, 2024, 6:55 PM
208 points
68 comments · 7 min read · LW link

Science advances one funeral at a time

Nov 1, 2024, 11:06 PM
98 points
9 comments · 2 min read · LW link

Self-prediction acts as an emergent regularizer

Oct 23, 2024, 10:27 PM
91 points
9 comments · 4 min read · LW link

The case for a negative alignment tax

Sep 18, 2024, 6:33 PM
75 points
20 comments · 7 min read · LW link

The EA case for Trump

Judd Rosenblatt · Aug 3, 2024, 1:00 AM
9 points
1 comment · 1 min read · LW link
(www.secondbest.ca)

Self-Other Overlap: A Neglected Approach to AI Alignment

Jul 30, 2024, 4:22 PM
213 points
49 comments · 12 min read · LW link

Yoshua Bengio: Reasoning through arguments against taking AI safety seriously

Judd Rosenblatt · Jul 11, 2024, 11:53 PM
70 points
3 comments · 1 min read · LW link
(yoshuabengio.org)