Diogo de Lucena

Karma: 585

Chief Scientist at AE Studio

Reducing LLM deception at scale with self-other overlap fine-tuning

Mar 13, 2025, 7:09 PM
136 points
33 comments, 5 min read, LW link

Science advances one funeral at a time

Nov 1, 2024, 11:06 PM
98 points
9 comments, 2 min read, LW link

Self-prediction acts as an emergent regularizer

Oct 23, 2024, 10:27 PM
91 points
9 comments, 4 min read, LW link

The case for a negative alignment tax

Sep 18, 2024, 6:33 PM
75 points
20 comments, 7 min read, LW link

Self-Other Overlap: A Neglected Approach to AI Alignment

Jul 30, 2024, 4:22 PM
213 points
49 comments, 12 min read, LW link

Video Intro to Guaranteed Safe AI

Jul 11, 2024, 5:53 PM
27 points
0 comments, 1 min read, LW link (youtu.be)

AE Studio @ SXSW: We need more AI consciousness research (and further resources)

Mar 26, 2024, 8:59 PM
67 points
8 comments, 3 min read, LW link