Florian_Dietz

Karma: 306

Revealing alignment faking with a single prompt

Florian_Dietz · Jan 29, 2025, 9:01 PM
9 points
5 comments · 4 min read · LW link

Florian_Dietz’s Shortform

Florian_Dietz · Jan 1, 2025, 2:27 PM
3 points
28 comments · LW link

Achieving AI Alignment through Deliberate Uncertainty in Multiagent Systems

Florian_Dietz · Feb 17, 2024, 8:45 AM
4 points
0 comments · 13 min read · LW link