Florian_Dietz

Karma: 286

Revealing alignment faking with a single prompt

Florian_Dietz Jan 29, 2025, 9:01 PM
9 points
5 comments, 4 min read

Florian_Dietz’s Shortform

Florian_Dietz Jan 1, 2025, 2:27 PM
3 points
15 comments

Achieving AI Alignment through Deliberate Uncertainty in Multiagent Systems

Florian_Dietz Feb 17, 2024, 8:45 AM
4 points
0 comments, 13 min read

Understanding differences between humans and intelligence-in-general to build safe AGI

Florian_Dietz 16 Aug 2022 8:27 UTC
7 points
8 comments, 1 min read

logic puzzles and loophole abuse

Florian_Dietz 30 Sep 2017 15:45 UTC
3 points
4 comments, 3 min read