Judd Rosenblatt

Karma: 1,376

CEO at AE Studio

Mistral Large 2 (123B) exhibits alignment faking

Mar 27, 2025, 3:39 PM
80 points
4 comments · 13 min read · LW link

Reducing LLM deception at scale with self-other overlap fine-tuning

Mar 13, 2025, 7:09 PM
155 points
40 comments · 6 min read · LW link

Alignment can be the ‘clean energy’ of AI

Feb 22, 2025, 12:08 AM
67 points
8 comments · 8 min read · LW link

Making a conservative case for alignment

Nov 15, 2024, 6:55 PM
208 points
67 comments · 7 min read · LW link

Science advances one funeral at a time

Nov 1, 2024, 11:06 PM
100 points
9 comments · 2 min read · LW link

Self-prediction acts as an emergent regularizer

Oct 23, 2024, 10:27 PM
91 points
9 comments · 4 min read · LW link

The case for a negative alignment tax

Sep 18, 2024, 6:33 PM
75 points
20 comments · 7 min read · LW link

The EA case for Trump

Judd Rosenblatt · Aug 3, 2024, 1:00 AM
14 points
1 comment · 1 min read · LW link
(www.secondbest.ca)

Self-Other Overlap: A Neglected Approach to AI Alignment

Jul 30, 2024, 4:22 PM
215 points
51 comments · 12 min read · LW link

Yoshua Bengio: Reasoning through arguments against taking AI safety seriously

Judd Rosenblatt · Jul 11, 2024, 11:53 PM
70 points
3 comments · 1 min read · LW link
(yoshuabengio.org)

There Should Be More Alignment-Driven Startups

May 31, 2024, 2:05 AM
62 points
14 comments · 11 min read · LW link

Key takeaways from our EA and alignment research surveys

May 3, 2024, 6:10 PM
111 points
10 comments · 21 min read · LW link

AE Studio @ SXSW: We need more AI consciousness research (and further resources)

Mar 26, 2024, 8:59 PM
67 points
8 comments · 3 min read · LW link

Survey for alignment researchers!

Feb 2, 2024, 8:41 PM
71 points
11 comments · 1 min read · LW link

The ‘Neglected Approaches’ Approach: AE Studio’s Alignment Agenda

Dec 18, 2023, 8:35 PM
175 points
22 comments · 12 min read · LW link · 1 review