Florian_Dietz

Karma: 228

Do we want alignment faking?

Florian_Dietz · 28 Feb 2025 21:50 UTC
7 points
2 comments · 1 min read · LW link

Revealing alignment faking with a single prompt

Florian_Dietz · 29 Jan 2025 21:01 UTC
9 points
5 comments · 4 min read · LW link

Florian_Dietz’s Shortform

Florian_Dietz · 1 Jan 2025 14:27 UTC
3 points
13 comments · 1 min read · LW link