
Charlie Steiner

Karma: 7,869

If you want to chat, message me!

LW1.0 username Manfred. PhD in condensed matter physics. I am independently thinking and writing about value learning.

Low-effort review of “AI For Humanity”

Charlie Steiner · Dec 11, 2024, 9:54 AM
13 points
0 comments · 4 min read

Rabin’s Paradox

Charlie Steiner · Aug 14, 2024, 5:40 AM
18 points
41 comments · 3 min read

Humans aren’t fleeb.

Charlie Steiner · Jan 24, 2024, 5:31 AM
37 points
5 comments · 2 min read

Neural uncertainty estimation review article (for alignment)

Charlie Steiner · Dec 5, 2023, 8:01 AM
74 points
3 comments · 11 min read

How to solve deception and still fail.

Charlie Steiner · Oct 4, 2023, 7:56 PM
40 points
7 comments · 6 min read

Two Hot Takes about Quine

Charlie Steiner · Jul 11, 2023, 6:42 AM
17 points
0 comments · 2 min read

Some background for reasoning about dual-use alignment research

Charlie Steiner · May 18, 2023, 2:50 PM
126 points
22 comments · 9 min read · 1 review

[Simulators seminar sequence] #2 Semiotic physics—revamped

Feb 27, 2023, 12:25 AM
24 points
23 comments · 13 min read

Shard theory alignment has important, often-overlooked free parameters.

Charlie Steiner · Jan 20, 2023, 9:30 AM
36 points
10 comments · 3 min read

[Simulators seminar sequence] #1 Background & shared assumptions

Jan 2, 2023, 11:48 PM
50 points
4 comments · 3 min read

Take 14: Corrigibility isn’t that great.

Charlie Steiner · Dec 25, 2022, 1:04 PM
15 points
3 comments · 3 min read

Take 13: RLHF bad, conditioning good.

Charlie Steiner · Dec 22, 2022, 10:44 AM
54 points
4 comments · 2 min read

Take 12: RLHF’s use is evidence that orgs will jam RL at real-world problems.

Charlie Steiner · Dec 20, 2022, 5:01 AM
25 points
1 comment · 3 min read

Take 11: “Aligning language models” should be weirder.

Charlie Steiner · Dec 18, 2022, 2:14 PM
34 points
0 comments · 2 min read

Take 10: Fine-tuning with RLHF is aesthetically unsatisfying.

Charlie Steiner · Dec 13, 2022, 7:04 AM
37 points
3 comments · 2 min read

Take 9: No, RLHF/IDA/debate doesn’t solve outer alignment.

Charlie Steiner · Dec 12, 2022, 11:51 AM
33 points
13 comments · 2 min read

Take 8: Queer the inner/outer alignment dichotomy.

Charlie Steiner · Dec 9, 2022, 5:46 PM
31 points
2 comments · 2 min read

Take 7: You should talk about “the human’s utility function” less.

Charlie Steiner · Dec 8, 2022, 8:14 AM
50 points
22 comments · 2 min read

Take 6: CAIS is actually Orwellian.

Charlie Steiner · Dec 7, 2022, 1:50 PM
14 points
8 comments · 2 min read

Take 5: Another problem for natural abstractions is laziness.

Charlie Steiner · Dec 6, 2022, 7:00 AM
31 points
4 comments · 3 min read