Florian_Dietz

Karma: 286

Edge Cases in AI Alignment

Florian_Dietz, Mar 24, 2025, 9:27 AM
19 points
3 comments, 4 min read

Split Personality Training: Revealing Latent Knowledge Through Personality-Shift Tokens

Florian_Dietz, Mar 10, 2025, 4:07 PM
35 points
3 comments, 9 min read

Do we want alignment faking?

Florian_Dietz, Feb 28, 2025, 9:50 PM
7 points
4 comments, 1 min read

Revealing alignment faking with a single prompt

Florian_Dietz, Jan 29, 2025, 9:01 PM
9 points
5 comments, 4 min read

Florian_Dietz’s Shortform

Florian_Dietz, Jan 1, 2025, 2:27 PM
3 points
14 comments

Achieving AI Alignment through Deliberate Uncertainty in Multiagent Systems

Florian_Dietz, Feb 17, 2024, 8:45 AM
4 points
0 comments, 13 min read

Understanding differences between humans and intelligence-in-general to build safe AGI

Florian_Dietz, Aug 16, 2022, 8:27 AM
7 points
8 comments, 1 min read

logic puzzles and loophole abuse

Florian_Dietz, Sep 30, 2017, 3:45 PM
3 points
4 comments, 3 min read

a different perspecive on physics

Florian_Dietz, Jun 26, 2017, 10:47 PM
0 points
15 comments, 3 min read

Teaching an AI not to cheat?

Florian_Dietz, Dec 20, 2016, 2:37 PM
5 points
12 comments, 1 min read

controlling AI behavior through unusual axiomatic probabilities

Florian_Dietz, Jan 8, 2015, 5:00 PM
5 points
11 comments, 1 min read

question: the 40 hour work week vs Silicon Valley?

Florian_Dietz, Oct 24, 2014, 12:09 PM
18 points
108 comments, 1 min read

LessWrong’s attitude towards AI research

Florian_Dietz, Sep 20, 2014, 3:02 PM
11 points
50 comments, 1 min read